Characterize additional effects of QAT beyond weight oscillations

Identify and characterize the additional training effects introduced by Quantization-Aware Training beyond weight oscillations, specifying their nature and how they influence optimization dynamics and performance at different quantization bit-widths.

Background

The paper argues that weight oscillations observed during Quantization-Aware Training (QAT) are not merely artifacts but can be beneficial for quantization robustness. It develops a regularization method (OsciQuant) that induces oscillations and demonstrates competitive performance with QAT at 3–4 bits, and improved cross-bit robustness in several settings.

Despite these findings, the authors note that QAT may introduce additional effects beyond oscillations. They observe consistent deviations between QAT and their regularization method—particularly at ternary precision and in cross-bit performance—suggesting there are other training dynamics in QAT not captured by oscillations alone. What these additional effects are remains unclear and is explicitly flagged as an unresolved issue.
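For context, the oscillation phenomenon itself can be illustrated with a toy sketch. Under straight-through-estimator (STE) training, the gradient is computed at the quantized weight but applied to the latent weight, so a latent weight whose loss optimum lies between two quantization levels is pushed back and forth across the rounding boundary rather than converging. The example below is a hypothetical 1-D simulation with made-up values (the quantizer, target, and learning rate are illustrative assumptions, not the paper's setup or the OsciQuant method):

```python
import numpy as np

def quantize(w, step=1.0):
    # Uniform rounding quantizer, as in standard fake-quantization for QAT.
    return step * np.round(w / step)

# Hypothetical 1-D case: the loss optimum (target) lies between the
# quantization levels 0.0 and 1.0, so neither level is a fixed point.
target = 0.4            # assumed optimum of a toy quadratic loss
w, lr = 0.6, 0.6        # latent weight and learning rate (toy values)

history = []
for _ in range(8):
    q = quantize(w)
    grad = q - target    # gradient of 0.5 * (q - target)^2 w.r.t. q
    w -= lr * grad       # STE: gradient at q is applied to the latent w
    history.append(quantize(w))

# The quantized weight flips between the two levels instead of settling.
print(history)
```

The quantized value keeps crossing the rounding boundary, which is the oscillation behavior the paper argues can be beneficial for quantization robustness; the open question above is what QAT does in addition to this.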

References

On the other hand, while it is not clear what the additional effects are during QAT, we do note two consistent deviations from the QAT performance when using our regularization method: QAT outperforms regularization at ternary quantization, whereas our regularization method outperforms QAT in cross-bit accuracy for the ternary and 3-bit case.

Oscillations Make Neural Networks Robust to Quantization  (2502.00490 - Wenshøj et al., 1 Feb 2025) in Section 6 (Discussion)