
Finite-State Dimension in Automata Theory

Updated 8 February 2026
  • Finite-state dimension is a measure that quantifies the asymptotic information density of infinite sequences as perceived by finite automata using block-entropy rates and compressibility methods.
  • It admits multiple equivalent characterizations via finite-state compressors, block entropy, martingale strategies, and automatic Kolmogorov complexity, linking algorithmic information theory with ergodic theory.
  • Extensions to multihead and relative models reveal a nuanced hierarchy of randomness, underpinning applications in fractal geometry and predictive modeling of symbolic data.

Finite-state dimension quantifies the asymptotic information density of infinite sequences as perceived by finite automata. It is the prototypical quantitative effectivization of classical Hausdorff dimension for symbolic data, forming a rigorous bridge between algorithmic information theory, ergodic theory, and automata-theoretic complexity. Finite-state dimension has robust equivalent characterizations in terms of block entropy rates, finite-state compressibility, finite-state gambling strategies (martingales), and automatic Kolmogorov complexity, and admits powerful extensions to multihead and relative (conditional) variants. Its study reveals both the automata-theoretic foundation of Borel normality and a fine-grained hierarchy of "randomness" and compressibility for infinite sequences.

1. Equivalent Characterizations

There are several mathematically equivalent ways to define the finite-state dimension $\dim_{FS}(X)$ of an infinite sequence $X \in \Sigma^\omega$ over a finite alphabet $\Sigma$, all of which yield a value in $[0,1]$.

Block-Entropy Rate Definition:

Let $k \geq 1$. For aligned blocks of length $k$ in $X$, form the empirical distribution $P_{k,N}$ over all $w \in \Sigma^k$ in the first $N$ $k$-blocks, and define the Shannon entropy $H_{k,N}(X) = H(P_{k,N})$. The lower block entropy rate is $H_k^-(X) = \liminf_{N \to \infty} H_{k,N}(X)$. Then

$$\dim_{FS}(X) = \lim_{k \to \infty} \frac{H_k^-(X)}{k} = \inf_{k \geq 1} \frac{H_k^-(X)}{k}$$

The use of non-aligned (sliding-window) blocks or disjoint blocks yields the same limit (Kozachinskiy et al., 2017, Becher et al., 2024).
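As a concrete illustration of the block-entropy definition, the following sketch estimates the aligned-block entropy rate $H_{k,N}(X)/k$ on a finite prefix (the function name and the finite-$N$ truncation are ours; the true dimension takes the $\liminf$ over $N$ and then the infimum over $k$):

```python
import math
from collections import Counter

def block_entropy_rate(x: str, k: int) -> float:
    """Empirical aligned-block entropy rate H_{k,N}(x)/k on a finite prefix.

    Splits x into disjoint blocks of length k, forms the empirical
    distribution over observed blocks, and returns its Shannon entropy
    divided by k, with logs in base |alphabet| so the value lies in [0, 1].
    """
    alphabet = set(x)
    blocks = [x[i:i + k] for i in range(0, len(x) - k + 1, k)]
    counts = Counter(blocks)
    n = len(blocks)
    h = sum((c / n) * math.log(n / c, len(alphabet)) for c in counts.values())
    return h / k

# A periodic prefix: only the block "01" ever occurs, so the rate is 0;
# with k = 1 the same prefix has rate 1 (both symbols equally frequent).
print(block_entropy_rate("01" * 500, 2))
```

A normal sequence would drive this estimate toward 1 for every $k$, while a compressible sequence exposes a small rate at some block length.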

Finite-State Compressor (Automatic Kolmogorov Complexity):

An information-lossless finite-state compressor $C$ is a finite automaton transducer whose output and terminal state uniquely determine the input. The finite-state dimension is

$$\dim_{FS}(X) = \inf_{C} \liminf_{n \to \infty} \frac{|C(X[0 \ldots n-1])|}{n \log |\Sigma|}$$

where the infimum is over all such $C$ (Kozachinskiy et al., 2017, Mayordomo, 2022).

Finite-State Gambling (Gales):

A finite-state $s$-gale is a martingale-like betting process $d$ implementable by a finite automaton, such that $d(w0) + d(w1) = 2^s d(w)$ for all $w$ (binary case). The dimension is

$$\dim_{FS}(X) = \inf \{ s : \exists \text{ finite-state } s\text{-gale } d \text{ with } \sup_n d(X[0 \ldots n-1]) = \infty \}$$

A strong variant requiring $\liminf_n d(X[0 \ldots n-1]) = \infty$ instead of $\limsup$ defines the strong finite-state dimension $\mathrm{Dim}_{FS}(X)$ (Kozachinskiy et al., 2017, Lutz et al., 2021).
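To make the gale condition concrete, here is a minimal simulation of a finite-state $s$-gale on a binary prefix. The encoding (dictionaries `betas` and `delta`, and the function name) is our own illustration: the gambler's state determines how capital is split between the two possible next bits, and the update $d(wb) = 2^s \beta_q(b)\, d(w)$ enforces $d(w0) + d(w1) = 2^s d(w)$.

```python
def run_fs_sgale(bits, s, betas, delta, start=0, d0=1.0):
    """Simulate a finite-state s-gale on a prefix of a binary sequence.

    betas[q]    -- fraction of capital staked on bit 0 in state q
                   (the remaining 1 - betas[q] goes on bit 1)
    delta[q][b] -- next state after reading bit b in state q
    The update d(wb) = 2**s * beta_q(b) * d(w) enforces the s-gale
    condition d(w0) + d(w1) = 2**s * d(w).
    """
    q, d = start, d0
    for b in bits:
        stake = betas[q] if b == 0 else 1.0 - betas[q]
        d *= (2.0 ** s) * stake
        q = delta[q][b]
    return d

# A one-state gambler staking 90% on 0 succeeds on the all-zeros sequence
# whenever s > -log2(0.9) ~ 0.152, witnessing dimension at most s there.
capital = run_fs_sgale([0] * 100, s=0.2, betas={0: 0.9},
                       delta={0: {0: 0, 1: 0}})
```

With $s = 1$ and fair stakes the capital is constant, recovering the ordinary martingale condition.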

Automatic Kolmogorov Complexity and Superadditivity:

Let $R$ be a rational relation on $\Sigma^* \times \Sigma^*$ defined by a finite automaton; $KA_R(w)$ is the minimal length of a description mapping to $w$ under $R$. Then

$$\dim_{FS}(X) = \inf_R \liminf_{n \to \infty} \frac{KA_R(X[0 \ldots n-1])}{n}$$

where the infimum is over all such automatic description modes (Kozachinskiy et al., 2017).

2. Foundational Properties and Normality

Borel normality, the property that every possible block of a given length appears with the expected limiting frequency, is the automata-theoretic threshold for maximal finite-state dimension: a sequence is Borel normal if and only if its finite-state dimension equals 1.

Periodic or ultimately periodic sequences have $\dim_{FS}(X) = 0$. Intermediate dimensions $r \in (0,1)$ can be realized by mixing periodic and normal segments, and via explicit constructions such as Liouville numbers with prescribed finite-state dimension (Nandakumar et al., 2012).

For saturated sets $X_P$ of all sequences over $\Sigma$ with limiting symbol frequencies $P$, $\dim_{FS}(X_P) = H(P)$, where $H(P)$ is the Shannon entropy. In such sets, every individual sequence $S \in X_P$ satisfies $\dim_{FS}(S) = H(P)$; this is a pointwise, not merely almost-everywhere, property [0703085].
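The value $H(P)$ in the saturated-set result is the ordinary Shannon entropy of the symbol-frequency vector; a small helper (ours) makes the number concrete:

```python
import math

def shannon_entropy(p, base=2):
    """Shannon entropy H(P) of a probability vector, in the given base."""
    return sum(q * math.log(1.0 / q, base) for q in p if q > 0)

# A binary sequence whose limiting symbol frequencies are (3/4, 1/4)
# has finite-state dimension H(3/4, 1/4) ~ 0.811.
print(shannon_entropy([0.75, 0.25]))
```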

3. Information-Theoretic and Markov Chain Characterizations

Recent developments establish Markov chain-based and information-theoretic perspectives:

Markov Chain Characterization:

For $X \in \{0,1\}^\infty$, drive every irreducible Markov chain $M$ with the bits of $X$. Let $W_M(X)$ be the set of limiting empirical edge-state distributions. Then

$$\dim_{FS}(X) = 1 - \sup_{M} \sup_{\mu \in W_M(X)} D_{KL}(\mu(E|Q) \,\|\, \pi_M(E|Q))$$

where $D_{KL}$ is the conditional Kullback-Leibler divergence, and $\pi_M$ is the stationary distribution of $M$. For strong dimension, the roles of infimum and supremum are exchanged (Bienvenu et al., 21 Oct 2025).

This broadens the Schnorr-Stimm correspondence: Borel normality is equivalent to stationarity of all such simulated Markov chains. Finite-state dimension quantifies the degree of statistical divergence from this ideal (Bienvenu et al., 21 Oct 2025).
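A finite-prefix sketch of this idea (the encoding of the chain as transition probabilities `p` with deterministic bit-driven state updates `delta`, and the function name, are ours; the actual characterization takes limit points of empirical edge distributions over all irreducible chains):

```python
import math
from collections import Counter

def empirical_conditional_kl(bits, p, delta, start=0):
    """Drive a finite chain with the bits of a sequence prefix and measure
    the conditional KL divergence between the empirical edge usage and the
    chain's own transition law -- a finite-prefix stand-in for the
    D_KL(mu(E|Q) || pi_M(E|Q)) term in the characterization.

    p[q][b]     -- probability the chain assigns to bit b in state q
    delta[q][b] -- state reached after reading bit b in state q
    """
    edges = Counter()
    q = start
    for b in bits:
        edges[(q, b)] += 1
        q = delta[q][b]
    n = sum(edges.values())
    visits = Counter()
    for (q, b), c in edges.items():
        visits[q] += c
    kl = 0.0
    for (q, b), c in edges.items():
        kl += (c / n) * math.log2((c / visits[q]) / p[q][b])
    return kl

# A fair-coin chain on the all-zeros prefix: mu(0|q) = 1 vs pi(0|q) = 1/2
# gives divergence 1, hence dimension estimate 1 - 1 = 0.
kl = empirical_conditional_kl([0] * 64, p={0: {0: 0.5, 1: 0.5}},
                              delta={0: {0: 0, 1: 0}})
```

On a normal sequence every such empirical divergence vanishes in the limit, recovering stationarity as described above.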

Gambling and Prediction Equivalence:

The equivalence of block-entropy, finite-state martingales, automatic Kolmogorov complexity, and compressor formulations is established via duality arguments, superadditivity, Kraft-type inequalities, and convexity of Shannon entropy (Kozachinskiy et al., 2017, S, 10 Feb 2025). These connections make explicit the precise sense in which finite automata can exploit regularities for compression, prediction, or statistical bias.

4. Multihead, Multi-bet, and Relative Dimensions

Finite-state dimension generalizes via more complex automata models:

Multihead Finite-State Dimension:

Multihead finite-state gamblers have $h$-head architectures with oblivious, one-way movement. For each $h \geq 1$, the $h$-head finite-state dimension $\dim_h(S)$ is defined via $h$-FSGs (finite-state gamblers), and

$$\mathrm{MH}\,\dim(S) = \inf_{h \geq 1} \dim_h(S)$$

The 1-head case recovers classical $\dim_{FS}$, and a strict hierarchy holds: for each $h$, there exist sequences for which $\dim_{h+1}(S) < \dim_h(S)$. Multihead dimension is stable under finite unions, but fixed-$h$ predimensions are not (Huang et al., 26 Sep 2025, Lutz, 20 Oct 2025).

Multi-bet (Product Gales) Dimension:

Product gales and $k$-bet finite-state gamblers spread bets across $k$ separate accounts at each symbol. Multi-bet finite-state dimension matches classical finite-state dimension, and both are characterized by the limiting sliding (or disjoint) block entropy rate (S, 10 Feb 2025).

Relative and Conditional Dimension:

Relative finite-state dimension $\dim_{FS}^Y(X)$ allows the automaton constant look-ahead or oracle access to another sequence $Y$. This is precisely characterized via conditional block entropies:

$$\dim_{FS}^Y(X) = \inf_\ell \liminf_{k \to \infty} H_\ell(X[:k\ell]\,|\,Y[:k\ell])$$

This framework unifies conditional normality and multidimensional selection principles, underpinning symmetries akin to van Lambalgen's theorem in algorithmic randomness (Nandakumar et al., 2023, Shen, 2024).
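The conditional block entropies in this characterization can be estimated on finite prefixes; the following sketch (function name and finite-prefix truncation ours) computes the empirical conditional entropy rate over aligned length-$\ell$ blocks:

```python
import math
from collections import Counter

def conditional_block_entropy_rate(x: str, y: str, ell: int) -> float:
    """Empirical conditional aligned-block entropy rate H_ell(x | y) / ell
    on a finite prefix: an estimate of the quantity whose limit in k and
    infimum over ell give the relative dimension dim_FS^Y(X)."""
    pairs = [(x[i:i + ell], y[i:i + ell])
             for i in range(0, len(x) - ell + 1, ell)]
    n = len(pairs)
    joint = Counter(pairs)
    given = Counter(b for _, b in pairs)
    h = sum((c / n) * math.log2(given[b] / c) for (a, b), c in joint.items())
    return h / ell

# When x equals y, each block of x is determined by the matching block of
# y, so the conditional rate is 0.
z = "0110" * 100
print(conditional_block_entropy_rate(z, z, 2))
```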

5. Structural Results, Examples, and Selection Principles

The dimension is robust under standard transformations:

  • Rational translations and scalings (e.g., $X \mapsto qX + r$) preserve $\dim_{FS}(X)$ (Nandakumar et al., 2012, Clanin et al., 3 Jun 2025).
  • Polynomial images: Linear rational-coefficient polynomials preserve finite-state dimension, but higher-degree polynomials or real coefficients can alter it arbitrarily (Clanin et al., 3 Jun 2025).
  • Saturated sets: For symbol-frequency-constrained classes $X_P$, $\dim_{FS}(X_P) = H(P)$, matching Hausdorff and packing dimensions [0703085].

Selection Principles:

  • Agafonov's theorem and its extensions: For any regular (finite-state) selection rule, the dimension of any subsequence selected from $X$ is preserved up to sharply computable bounds via stationary mass, with equality in the normal/full-dimension case (Bienvenu et al., 21 Oct 2025).
  • Arithmetic progression subsequences: For $X$ and its $d$-AP subsequences $A_{d,i}(X)$,

$$\dim_{FS}(X) \geq \frac{1}{d} \left[ \dim_{FS}(A_{d,0}) + \sum_{i=1}^{d-1} \dim_{FS}^{A_{d,0},\ldots,A_{d,i-1}}(A_{d,i}) \right]$$

with equality characterizing normality and facilitating strong converses to Wall's theorem (Nandakumar et al., 2023).
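The subsequences $A_{d,i}(X)$ in this bound are simply the arithmetic-progression slices of $X$; in Python (illustrative helper, name ours):

```python
def ap_subsequences(x: str, d: int):
    """The d arithmetic-progression subsequences of x:
    A_{d,i}(x) = x[i], x[i+d], x[i+2d], ... for i = 0, ..., d-1."""
    return [x[i::d] for i in range(d)]

# x interleaves its d-AP slices; the lower bound relates dim_FS(x) to the
# (relative) dimensions of these slices.
print(ap_subsequences("abcdefghij", 3))
```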

Liouville Numbers:

For every rational $r \in [0,1]$ and base $k \geq 2$, there exists a Liouville number with $\dim_{FS} = r$, demonstrating the existence of transcendental numbers with prescribed compressibility profiles (Nandakumar et al., 2012).

6. Connections to Fractal Geometry, Entropy, and Computational Aspects

  • For saturated classes and many invariant subshifts, finite-state dimension coincides with Hausdorff dimension and (in the strong variant) packing dimension, ensuring alignment with geometric fractal measures [0703085].
  • The entropy-rate interpretation of finite-state dimension situates it as a lower asymptotic density, i.e., the minimal compression rate achievable via finite automata, or equivalently, the minimal average uncertainty per symbol as measured by empirical block frequencies (Becher et al., 2024, Lutz et al., 2021).
  • Rauzy's sliding-window predictor mismatch rates $\underline{\beta}(x)$ and their sharp relationship to block entropies provide algorithmic, computable bounds:

$$2\underline{\beta}(x) \leq \dim_{FS}(x) \leq H(\underline{\beta}(x))$$

yielding practical methods for lower and upper bounding dimFS\dim_{FS} on observed data (Becher et al., 2024).
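A simplified sketch of such a predictor (our own online variant, not Rauzy's exact construction: it predicts the continuation most frequently seen so far after the current length-$\ell$ window, and reports the empirical mismatch fraction as a finite-prefix stand-in for $\underline{\beta}(x)$):

```python
from collections import Counter, defaultdict

def mismatch_rate(x: str, ell: int) -> float:
    """Empirical mismatch rate of an online sliding-window predictor.

    Predicts x[i] as the continuation most frequently observed so far
    after the preceding length-ell window (falling back to x[0] for
    never-seen windows), then returns the fraction of mispredictions.
    """
    seen = defaultdict(Counter)
    mistakes = 0
    for i in range(ell, len(x)):
        w = x[i - ell:i]
        guess = seen[w].most_common(1)[0][0] if seen[w] else x[0]
        if guess != x[i]:
            mistakes += 1
        seen[w][x[i]] += 1
    return mistakes / (len(x) - ell)

# On a periodic sequence every window has a unique continuation, so the
# predictor errs only finitely often and the rate tends to 0.
rate = mismatch_rate("011" * 400, ell=3)
```

Plugging such an estimate into the two-sided bound gives rough, data-driven lower and upper bounds on $\dim_{FS}$.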

7. Point-to-Set Principle, Generalizations, and Modern Directions

Point-to-Set Principle:

Finite-state dimension admits a point-to-set characterization: for $A \subseteq [0,1)$, $\dim_{FS}(A) = \min_f \dim^{f}(A)$, where $f$ runs over "separator enumerators" densely enumerating points in $[0,1)$. This mirrors the classical result for Hausdorff dimension and provides an operational route to assessing dimension through automata-encoded precision information (Mayordomo, 2022).

Generalizations and Further Structure:

  • Multihead and multi-bet extensions allow increasingly rich automata architectures, revealing strict hierarchies and novel forms of stability and instability for unions and transformations (Huang et al., 26 Sep 2025, Lutz, 20 Oct 2025, S, 10 Feb 2025).
  • Weyl's criterion admits an extension: finite-state dimension can be characterized in terms of the infimum of lower average entropies of all weak subsequential limits of empirical measures from Weyl exponential sums, providing a bridge to harmonic analysis and number-theoretic randomness (Lutz et al., 2021).

The landscape of finite-state dimension is characterized by deep equivalences, structural invariance, and quantitative sensitivity to algorithmic and statistical properties of sequences. Its ongoing generalizations—multihead, multi-bet, and relative models—reflect a broadening theoretical interface between automata, information theory, fractal geometry, and the algorithmic foundations of randomness.
