Determining the contribution weight of residual information during denoising
Ascertain a principled scheme for weighting the contribution of residual information relative to mask embeddings when constructing input embeddings for masked positions during denoising in diffusion large language models.
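To make the quantity in question concrete, here is a minimal sketch of what "weighting" could mean at a single masked position. Everything in it is an assumption made for illustration, not the cited paper's formulation: the residual is taken to be a probability-weighted average of token embeddings, the input embedding is a convex mix of the mask embedding and that residual, and the weight `alpha` is set to the normalized entropy of the predicted distribution. Whether the weight should rise or fall with entropy, and whether a convex mix is the right form at all, is exactly what the problem leaves open. All function names are hypothetical.

```python
import math


def normalized_entropy(probs):
    """Shannon entropy of a distribution, scaled to [0, 1] by log(vocabulary size)."""
    if len(probs) < 2:
        return 0.0
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(len(probs))


def residual_embedding(probs, token_embeddings):
    """Probability-weighted average of token embeddings (assumed form of the residual)."""
    dim = len(token_embeddings[0])
    return [
        sum(p * emb[d] for p, emb in zip(probs, token_embeddings))
        for d in range(dim)
    ]


def masked_input_embedding(mask_emb, residual, alpha):
    """Convex mix: alpha = 0 keeps the plain mask embedding, alpha = 1 uses only the residual."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [(1.0 - alpha) * m + alpha * r for m, r in zip(mask_emb, residual)]


# Toy example: a 2-token vocabulary with 2-dimensional embeddings.
token_embeddings = [[1.0, 0.0], [0.0, 1.0]]
mask_emb = [0.0, 0.0]
probs = [0.5, 0.5]  # prediction from the previous denoising step

alpha = normalized_entropy(probs)  # hypothetical weighting rule
residual = residual_embedding(probs, token_embeddings)
input_emb = masked_input_embedding(mask_emb, residual, alpha)
```

Swapping in a different rule for `alpha` (a constant, a schedule over denoising steps, or a learned gate) leaves the rest of the sketch unchanged, which is why the weighting scheme can be studied in isolation.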
References
Hence, the only question yet to be solved is to determine how much should residual information contribute.
— Residual Context Diffusion Language Models
(2601.22954 - Hu et al., 30 Jan 2026) in Section 4.1 (Entropy Weighted Residual)