Source of gains from arbitrary-order decoding in diffusion language models
Determine whether the performance gains attributed to arbitrary-order token generation in diffusion language models arise primarily from improved exploitation of solution patterns already encoded in the training data and model, or from qualitatively new reasoning strategies that are unattainable under purely autoregressive left-to-right decoding. Clarify the causal contribution of order arbitrariness to reasoning performance in standard domains such as mathematics and code generation.
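The distinction at issue can be made concrete with a small sketch. The snippet below contrasts the two decoding regimes on a toy fully-masked sequence: strict left-to-right infilling versus confidence-ranked arbitrary-order infilling. All names here (`toy_confidences`, the dummy token ids) are illustrative placeholders rather than any particular model's API; a real diffusion language model would score masked positions with its learned denoiser.

```python
import numpy as np

MASK = -1  # placeholder id for a not-yet-decoded position


def toy_confidences(tokens, rng):
    """Stand-in for a model's per-position confidence over masked slots.

    A real diffusion LM would score every masked position jointly;
    here we draw random confidences purely for illustration.
    """
    conf = np.full(len(tokens), -np.inf)
    for i, t in enumerate(tokens):
        if t == MASK:
            conf[i] = rng.random()
    return conf


def decode(length, order, seed=0):
    """Fill a fully masked sequence, one position per step.

    order = "left_to_right": always commit the leftmost masked slot
            (the autoregressive regime).
    order = "arbitrary":     commit the masked slot with the highest
            confidence, wherever it sits (the any-order regime).
    Returns the order in which positions were committed.
    """
    rng = np.random.default_rng(seed)
    tokens = [MASK] * length
    trace = []
    for _ in range(length):
        conf = toy_confidences(tokens, rng)
        if order == "left_to_right":
            pos = tokens.index(MASK)
        else:
            pos = int(np.argmax(conf))
        tokens[pos] = int(rng.integers(0, 100))  # dummy token id
        trace.append(pos)
    return trace


print("left-to-right order:", decode(6, "left_to_right"))
print("arbitrary order:    ", decode(6, "arbitrary"))
```

The open question is whether the extra freedom visible in the second trace lets the model reach answers it could not reach left-to-right, or merely reorders steps it could already take.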
References
Despite these advances, it remains unclear whether the observed gains primarily arise from better exploitation of existing solution patterns encoded in the data and model, or whether order arbitrariness itself enables qualitatively new reasoning strategies that are unattainable under a purely autoregressive decoding regime.