Universality Class of the Grokking Dimensional Crossover

Determine the precise universality class of the dimensional crossover observed at the grokking transition in neural network training, in which the effective dimensionality D, obtained via finite-size scaling of gradient avalanche dynamics, crosses from sub-diffusive (D < 1) to super-diffusive (D > 1).

Background

The paper presents evidence that grokking corresponds to a dimensional phase transition in gradient space: the effective dimensionality D, extracted via finite-size scaling of gradient avalanche dynamics (using a threshold-driven diffusion probe inspired by the Olami-Feder-Christensen model), evolves from sub-diffusive (D ≈ 0.90) to super-diffusive (D ≈ 1.20), crossing D ≈ 1 at generalization onset.
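As a rough illustration of how an exponent like D can be read off from finite-size scaling, the sketch below fits the slope of a log-log relation between a characteristic avalanche size and the system size. The scaling form s_c(L) ∝ L^D, the function name `estimate_effective_dimension`, and the synthetic inputs are all assumptions made for this example; the paper's actual scaling ansatz and probe are not reproduced here.

```python
import numpy as np

def estimate_effective_dimension(system_sizes, cutoff_sizes):
    """Return the least-squares slope of log(cutoff) against log(system size).

    Assumes (hypothetically) that the characteristic avalanche size
    scales as s_c(L) ~ L**D, so the slope is an estimate of D.
    """
    log_l = np.log(np.asarray(system_sizes, dtype=float))
    log_s = np.log(np.asarray(cutoff_sizes, dtype=float))
    slope, _intercept = np.polyfit(log_l, log_s, 1)
    return float(slope)

# Synthetic, noise-free data built with known exponents (not measured values).
sizes = np.array([16, 32, 64, 128, 256], dtype=float)
d_pre = estimate_effective_dimension(sizes, 2.0 * sizes ** 0.90)
d_post = estimate_effective_dimension(sizes, 2.0 * sizes ** 1.20)

print(f"pre-grokking-like slope:  {d_pre:.2f}")
print(f"post-grokking-like slope: {d_post:.2f}")
```

With real measurements the cutoff sizes would carry noise, so the fitted slope should be reported with an uncertainty (for example from bootstrap resampling) before deciding on which side of D = 1 it falls.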

The authors report topology invariance and heavy-tailed avalanche statistics consistent with self-organized criticality, and identify two statistically distinct scaling regimes, one before and one after grokking. Despite these findings, the precise universality class governing this dimensional crossover has not yet been identified.
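Pinning down a universality class usually means comparing measured exponents against known values, and the avalanche-size tail exponent is one such quantity. The sketch below uses the standard maximum-likelihood estimator for a continuous power-law tail, applied to synthetic samples. The helper name `powerlaw_tail_exponent`, the cutoff `s_min`, and the generated data are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def powerlaw_tail_exponent(sizes, s_min):
    """Maximum-likelihood exponent for p(s) ~ s**(-alpha), s >= s_min.

    Uses the continuous-data estimator
    alpha = 1 + n / sum(log(s_i / s_min)) over the n samples above s_min.
    """
    sizes = np.asarray(sizes, dtype=float)
    tail = sizes[sizes >= s_min]
    return 1.0 + tail.size / np.sum(np.log(tail / s_min))

# Synthetic avalanche sizes drawn by inverse-transform sampling from a
# power law with a known exponent (hypothetical values, not measurements).
rng = np.random.default_rng(0)
true_alpha = 1.5
s_min = 1.0
u = rng.random(200_000)
samples = s_min * (1.0 - u) ** (-1.0 / (true_alpha - 1.0))

alpha_hat = powerlaw_tail_exponent(samples, s_min)
print(f"estimated tail exponent: {alpha_hat:.3f}")
```

On real avalanche data the choice of `s_min` matters, and a heavy tail alone does not establish a power law, so a fit like this is a starting point for comparison rather than evidence of a particular class.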

References

The precise universality class of this dimensional crossover remains an open question.

Grokking as Dimensional Phase Transition in Neural Networks (2604.04655 - Wang, 6 Apr 2026) in Main text, paragraph beginning "Our results reframe grokking as a measurable dimensional phase transition...", immediately after Figure 3 and before Acknowledgments.