Tightness of padding bounds for transformer recognition of context-free languages

Determine whether the current padding-token upper bounds for transformer-based recognition of context-free languages are tight. Specifically, ascertain the minimal number of padding tokens required for averaging hard-attention transformers with logarithmically looped layers and log-precision arithmetic to recognize (i) all context-free languages using O(n^6) padding, (ii) unambiguous context-free languages using O(n^3) padding, and (iii) unambiguous linear context-free languages using O(n^2) padding; either establish matching lower bounds or construct recognition algorithms that use asymptotically fewer padding tokens for these classes.

Background

The paper proves that looped transformers with O(log n) iterations and padding can recognize context-free languages, providing explicit upper bounds on the number of padding tokens needed: O(n^6) for general CFLs, O(n^3) for unambiguous CFLs, and O(n^2) for unambiguous linear CFLs. Padding tokens serve as additional computational space to parallelize the recognition algorithms.
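For intuition on where such polynomial resource counts arise, the classical CYK algorithm recognizes a CFL (in Chomsky normal form) by filling O(n^2) table cells, each requiring O(n) split points, for O(n^3) total work; parallelizing such dynamic programs is what the padding tokens pay for. The sketch below is the standard sequential CYK recognizer, not the paper's transformer construction; the grammar used in the usage example (for the language a^n b^n) is an illustrative assumption.

```python
def cyk(word, rules, start="S"):
    """Classical CYK recognition for a grammar in Chomsky normal form.

    rules maps each nonterminal to a list of right-hand sides: either a
    single terminal character or a pair (tuple) of nonterminals.
    Fills O(n^2) table cells with O(n) work per cell: O(n^3) total.
    """
    n = len(word)
    if n == 0:
        return False
    # table[i][l] holds the nonterminals deriving word[i:i+l]
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, ch in enumerate(word):
        for lhs, rhss in rules.items():
            if any(rhs == ch for rhs in rhss):
                table[i][1].add(lhs)
    for length in range(2, n + 1):          # O(n) span lengths
        for i in range(n - length + 1):     # O(n) start positions
            for split in range(1, length):  # O(n) split points
                left = table[i][split]
                right = table[i + split][length - split]
                for lhs, rhss in rules.items():
                    for rhs in rhss:
                        if (isinstance(rhs, tuple)
                                and rhs[0] in left and rhs[1] in right):
                            table[i][length].add(lhs)
    return start in table[0][n]


# Hypothetical CNF grammar for { a^n b^n : n >= 1 }:
# S -> AB | AT, T -> SB, A -> a, B -> b
anbn = {
    "S": [("A", "B"), ("A", "T")],
    "T": [("S", "B")],
    "A": ["a"],
    "B": ["b"],
}
```

For unambiguous linear grammars every rule pair spans a fixed split, which removes the inner O(n) loop; the gap between this O(n^2)/O(n^3) sequential picture and the paper's O(n^2)/O(n^3)/O(n^6) padding counts is exactly what the tightness question probes.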

While the log-depth requirement is provably necessary under standard complexity assumptions (improving to fixed depth would imply TC^0 = NC^1), the authors do not establish tightness of their padding bounds. They note that practical concerns motivate seeking more space-efficient constructions or proving minimality of the current bounds.

References

Although it is not possible to improve our $\log$-depth recognition algorithm to fixed depth unless $\mathsf{TC}^0 = \mathsf{NC}^1$, our padding bounds are not known to be tight.

Context-Free Recognition with Transformers  (2601.01754 - Jerad et al., 5 Jan 2026) in Discussion and conclusion