Improving the number of pretraining batches required for regret guarantees
Ascertain whether the sufficient condition on the number of pretraining batches M in the permutation‑invariant ERM setup—namely M ≥ exp(C((log^2 n)/(log log n) + B_n)) for universal prior‑on‑priors with rate B_n—can be improved while retaining the same regret bound, potentially by imposing additional structural assumptions on the estimator \widehat{\theta}^n.
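To see why the current condition is superpolynomial in n, the bound M ≥ exp(C((log² n)/(log log n) + B_n)) can be evaluated numerically. The sketch below is illustrative only: the constant C and the rate B_n are placeholders (set to 1 and 0), not values from the paper.

```python
import math

def pretraining_batches_lower_bound(n, C=1.0, B_n=0.0):
    """Evaluate the sufficient condition M >= exp(C((log^2 n)/(log log n) + B_n)).

    C and B_n are illustrative placeholders, not constants from the paper.
    Requires n > e so that log(log(n)) > 0.
    """
    log_n = math.log(n)
    return math.exp(C * (log_n ** 2 / math.log(log_n) + B_n))

# The ratio M / n^k diverges for any fixed k, i.e. the bound is superpolynomial:
for n in [10**2, 10**4, 10**6]:
    m = pretraining_batches_lower_bound(n)
    print(f"n={n:>8}: M >= {m:.3e}, M/n^2 = {m / n**2:.3e}")
```

The growing ratio M/n² illustrates that no polynomial in n eventually dominates the bound, which is what motivates seeking structural assumptions on \widehat{\theta}^n that could shrink it.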
References
This is superpolynomial in $n$, but it agrees with the practical intuition that pretraining typically requires a huge amount of training data. We leave it to future study whether a better condition on $M$ can be obtained by assuming different structures of $\widehat{\theta}^n$.
— Universal priors: solving empirical Bayes via Bayesian inference and pretraining
(2602.15136 - Cannella et al., 16 Feb 2026) in Section 4.1 (Finite number of batches)