Practical equivalence of hidden interpolation to true out-of-distribution generalization
Ascertain the extent to which performance gains achieved via hidden interpolation over a model's training corpus (i.e., solving evaluation tasks by interpolating among training examples that implicitly cover the test distribution) are practically equivalent to genuine out-of-distribution (OOD) generalization in large language models.
References
This is a valid perspective, but (1) the deviation from the assumptions of empirical risk minimization should then be explicitly noted, and (2) it is unclear to what extent even perfect hidden interpolation would be practically equivalent to true OOD generalization.
— Soft Contamination Means Benchmarks Test Shallow Generalization
(2602.12413 - Spiesberger et al., 12 Feb 2026) in Section: Limitations and Future Work