A Theoretical Framework for OOD Robustness in Transformers using Gevrey Classes
Abstract: We study the robustness of Transformer language models under semantic out-of-distribution (OOD) shifts, where training and test data lie in disjoint latent spaces. Using the Wasserstein-1 distance and Gevrey-class smoothness assumptions, we derive sub-exponential upper bounds on prediction error. Our theoretical framework explains how smoothness governs generalization under distributional drift. We validate these findings through controlled experiments on arithmetic and Chain-of-Thought tasks with latent permutations and scalings, and find that the empirical degradation aligns with our bounds, highlighting the geometric and functional principles underlying OOD generalization in Transformers.
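As a worked illustration of the quantities the abstract names (not a reproduction of the paper's proofs): the Wasserstein-1 control of a prediction shift follows the standard Kantorovich-Rubinstein pattern, and the Gevrey class is the usual derivative-growth condition. The final sub-exponential form is a hedged sketch of how such a bound typically reads; the constants C, c, M and the exact exponent in the Gevrey order sigma are illustrative assumptions, not the paper's stated result.

% Kantorovich-Rubinstein duality: for a Lipschitz predictor f, the shift
% in expected prediction between train and test distributions is
% controlled by the Wasserstein-1 distance between them.
\[
\bigl| \mathbb{E}_{x \sim P_{\mathrm{test}}}[f(x)] - \mathbb{E}_{x \sim P_{\mathrm{train}}}[f(x)] \bigr|
\;\le\; \mathrm{Lip}(f)\, W_1\!\bigl(P_{\mathrm{train}}, P_{\mathrm{test}}\bigr).
\]
% Gevrey class G^{\sigma}, \sigma \ge 1: derivatives of all orders grow
% at most factorially to the power \sigma (\sigma = 1 is real-analytic).
\[
\sup_x \bigl| \partial^{\alpha} f(x) \bigr|
\;\le\; C\, M^{|\alpha|}\, (\alpha!)^{\sigma}
\quad \text{for all multi-indices } \alpha .
\]
% Illustrative sub-exponential error bound in the shift size
% (constants C, c and the exponent 1/\sigma are hypothetical):
\[
\mathrm{Err}\bigl(P_{\mathrm{test}}\bigr)
\;\le\; C \exp\!\Bigl( c\, W_1\bigl(P_{\mathrm{train}}, P_{\mathrm{test}}\bigr)^{1/\sigma} \Bigr),
\qquad \sigma > 1 ,
\]
% which grows strictly slower than \exp(c\,W_1), i.e. sub-exponentially
% in the distributional drift.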