Computational power of fixed-precision Transformers with positional encodings

Characterize the computational power of Transformer networks with positional encodings when all internal computations use fixed (finite) precision; in particular, determine the class of formal languages recognizable by such fixed-precision Transformers.

Background

The paper proves that Transformers are Turing complete under the assumption of arbitrary-precision internal representations, a construction enabled in part by positional encodings. It then observes that under fixed precision the positional encodings take only finitely many values, so attaching a position to each input token amounts to enlarging the input alphabet by a finite factor; the resulting model is therefore no longer Turing complete. This raises the question of what formal language recognition power remains under fixed precision.
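The finiteness observation can be illustrated concretely. The sketch below (an illustration under assumed parameters, not a construction from the paper) computes standard sinusoidal positional encodings, quantizes each coordinate to a fixed number of decimal places as a stand-in for fixed-precision arithmetic, and counts the distinct encodings that survive: the count is bounded by a constant independent of sequence length, which is exactly why positions degenerate into finitely many extra "alphabet symbols".

```python
import math

def positional_encoding(pos, d_model=2, base=10000.0):
    """Sinusoidal positional encoding in the style of the Transformer paper.

    d_model=2 is an illustrative choice: one sine/cosine pair per position.
    """
    vec = []
    for i in range(0, d_model, 2):
        angle = pos / (base ** (i / d_model))
        vec.append(math.sin(angle))
        vec.append(math.cos(angle))
    return vec

def quantize(vec, decimals=1):
    """Crude model of fixed precision: round every coordinate."""
    return tuple(round(x, decimals) for x in vec)

n_positions = 5000
distinct = {quantize(positional_encoding(p)) for p in range(n_positions)}

# Each rounded coordinate takes at most 21 values in [-1.0, 1.0] (step 0.1),
# so there are at most 21 * 21 = 441 distinct quantized encodings, no matter
# how many positions we feed in.
print(len(distinct), "distinct encodings for", n_positions, "positions")
```

Under arbitrary precision, by contrast, every position yields a distinct encoding vector, and the Turing-completeness construction in the paper exploits exactly that unbounded distinguishability.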

The authors explicitly identify the need to study the computational power of fixed-precision Transformers as future work, asking for a theoretical characterization beyond the established non-Turing completeness result.

References

"Although no longer Turing complete, one can still study the computational power of fixed-precision Transformers. We left this as future work."

On the Turing Completeness of Modern Neural Network Architectures  (1901.03429 - Pérez et al., 2019) in Section 3.3, Differences with [Transformer]'s framework — The need of arbitrary precision