Size-scaling of training stability and performance gains from learnable multipliers
Determine how the training-stability and performance gains produced by learnable multipliers scale with model size across the large-language-model regime.
References
Yet, many questions are left open. How training stability and performance improvement of LRMs scale with model size is another practical question.
— Learnable Multipliers: Freeing the Scale of Language Model Matrix Layers
(2601.04890 - Velikanov et al., 8 Jan 2026) in Section 6: Conclusion and discussion