T$^\star$: Progressive Block Scaling for MDM Through Trajectory Aware RL

Published 16 Jan 2026 in cs.CL | (2601.11214v1)

Abstract: We present T$^\star$, a simple \textsc{TraceRL}-based training curriculum for progressive block-size scaling in masked diffusion LLMs (MDMs). Starting from an AR-initialized small-block MDM, T$^\star$~transitions smoothly to larger blocks, enabling higher-parallelism decoding with minimal performance degradation on math reasoning benchmarks. Moreover, further analysis suggests that T$^\star$~can converge to an alternative decoding schedule $\hat{\rm S}$ that achieves comparable performance.