Characterize additional effects of QAT beyond weight oscillations
Identify and characterize the additional training effects introduced by Quantization-Aware Training beyond weight oscillations, specifying their nature and how they influence optimization dynamics and performance at different quantization bit-widths.
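For context on the baseline phenomenon the question builds on, the following is a minimal toy sketch (an illustrative setup, not taken from the paper) of how Quantization-Aware Training with fake quantization and a straight-through estimator produces weight oscillations: when the loss optimum falls between two quantization levels, the latent weight is repeatedly pushed back and forth across a rounding boundary, so the quantized weight flips between adjacent levels. The loss, learning rate, and target value here are all assumptions for illustration.

```python
import numpy as np

def fake_quantize(w, bits):
    """Uniform symmetric fake quantization to `bits` signed bits on [-1, 1]."""
    levels = 2 ** (bits - 1) - 1  # number of positive levels
    return np.round(w * levels) / levels

# Toy 1-D loss pulls the quantized weight toward a target that lies
# between two quantization levels (0.0 and 1.0), so it is not representable.
target = 0.4
w = 0.45   # latent ("shadow") weight updated by SGD
lr = 0.2
trace = []
for step in range(8):
    q = fake_quantize(w, bits=2)   # forward pass uses the quantized weight
    grad = 2 * (q - target)        # dL/dq for L = (q - target)^2
    w -= lr * grad                 # STE: gradient w.r.t. q applied to latent w
    trace.append(q)

# The quantized value oscillates between the two nearest levels as the
# latent weight crosses the rounding boundary on successive steps.
print(trace)
```

Running the loop shows `trace` alternating between 0.0 and 1.0 rather than settling, which is the oscillation behavior the question asks to look beyond.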
References
On the other hand, while it is not clear what the additional effects during QAT are, we do note two consistent performance deviations between QAT and our regularization method: QAT outperforms regularization at ternary quantization, whereas our regularization method outperforms QAT in cross-bit accuracy for the ternary and 3-bit cases.
— Oscillations Make Neural Networks Robust to Quantization
(2502.00490 - Wenshøj et al., 1 Feb 2025) in Section 6 (Discussion)