Additional optimization flaws and non-learned matrix components
Identify further optimization-induced flaws in standard LLM training beyond the unlearned matrix scale. Determine whether components of parameter matrices other than row and column norms also fail to be learned automatically, and develop corrective strategies for any such flaws.
References
It is an open question whether there are other flaws of such kind and whether they can be corrected. For example, are there other parts of parameter matrices apart from row and column norms that are not learned automatically?
— Learnable Multipliers: Freeing the Scale of Language Model Matrix Layers
(2601.04890 - Velikanov et al., 8 Jan 2026) in Section 6: Conclusion and discussion