
SpaceTime Exponential Autoregressive Dynamics

Updated 12 January 2026
  • SpaceTime's exponential autoregressive activities are defined by state-space models that exhibit rapid gradient growth through eigenvalue dynamics and companion parameterization.
  • Horizon Activation Mapping (HAM) quantitatively measures these exponential signatures, enhancing model interpretability and enabling targeted optimization strategies.
  • This approach offers practical benefits in model comparison, training efficiency, and controlled gradient propagation for accurate long-horizon time series forecasting.

SpaceTime's Exponential Autoregressive Activities refer to the phenomenon in which the SpaceTime state-space model (SSM) for time series forecasting exhibits an exponential amplification of gradients and hidden state activations across forecasting horizons. Rooted in the eigenstructure of its discrete-time parameter matrices, this signature is quantitatively revealed by Horizon Activation Mapping (HAM), yielding both theoretical and practical implications for model behavior, interpretability, and architecture selection. SpaceTime's SSM leverages a companion-matrix parameterization to recover AR(p) processes exactly, enabling autoregressive feedback with efficient O(d log d + ℓ log ℓ) training and inference complexity. The exponential autoregressive activity manifests when eigenvalues associated with the state propagation matrix exceed unit modulus, which in turn is detected and measured by the HAM technique as horizon-indexed exponential growth rates in gradient or activation norms (Krupakar et al., 5 Jan 2026, Zhang et al., 2023).

1. Discrete State-Space Fundamentals and Companion Parameterization

SpaceTime operates on the canonical discrete-time SSM recurrence:

$$s_t = A\,s_{t-1} + B\,u_t, \qquad y_t = C\,s_t + D\,u_t$$

where $s_t \in \mathbb{R}^n$ is the hidden state, $u_t \in \mathbb{R}^m$ is the input (including autoregressive feedback), and $y_t \in \mathbb{R}^k$ is the output. The state transition matrix $A \in \mathbb{R}^{n \times n}$ is typically chosen to be (block-)diagonalizable or explicitly in companion form, enabling exact AR(p) process emulation:

$$A = \begin{bmatrix} 0 & \cdots & 0 & -a_p \\ 1 & & 0 & -a_{p-1} \\ & \ddots & & \vdots \\ 0 & \cdots & 1 & -a_1 \end{bmatrix}, \qquad x_t = \begin{bmatrix} y_{t-1} \\ \vdots \\ y_{t-p} \end{bmatrix}$$

This structure ensures that the AR(p) characteristic polynomial is matched exactly, with all roots and associated dynamics under direct parameter control (Zhang et al., 2023).

2. Mechanisms Inducing Exponential Autoregressive Activity

Exponential autoregressive activity in SpaceTime arises from three tightly coupled mechanisms:

  • Eigenstructure of the State Transition: If the largest eigenvalue modulus $|\lambda_{\max}|$ of $A$ exceeds 1, repeated application leads to exponential growth in the components of the hidden state and the corresponding gradients.
  • Autoregressive Feedback Loop: During training, SpaceTime feeds its predictions $y_t$ back into future inputs, so the loss gradient at horizon step $i$ propagates through $i$ applications of $A$, amplifying exponential signatures for large $|\lambda|$.
  • Auxiliary Losses and Forecast Masking: Even when intermediate losses are masked and the forecasting branch is disabled, the state's gradient path still reflects repeated application of $A$, preserving the exponential activity at reduced magnitude (Krupakar et al., 5 Jan 2026).

Mathematically, for a single-step regression loss at horizon index $i$:

$$\|\nabla_{s_0} L\|_2 \approx K\,|\lambda_{\max}|^i$$

defining the exponential autoregressive activity at horizon step $i$ as:

$$g(i) = \mathbb{E}_{\text{batch},\,\theta}\big[\,\|\nabla_{\theta}\,\ell_{1\text{-}\exp}(M_c(i),\,x,\,y)\|_2\,\big] \approx \alpha\,e^{\beta i}, \qquad \beta \sim \ln|\lambda_{\max}|$$

where $\ell_{1\text{-}\exp}$ denotes the loss masked to a single horizon step (Krupakar et al., 5 Jan 2026).
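The $|\lambda_{\max}|^i$ scaling can be checked numerically: the Jacobian of the state after $i$ steps with respect to $s_0$ is $A^i$, so $\log\|A^i\|$ grows linearly with slope $\beta \approx \ln|\lambda_{\max}|$ once the dominant mode takes over. A minimal sketch on an assumed toy companion-style matrix:

```python
import numpy as np

# Companion-style matrix with eigenvalues 1.05 and 0.9
# (characteristic polynomial z^2 - 1.95 z + 0.945).
A = np.array([[0.0, -0.945],
              [1.0,  1.95]])

# The gradient at horizon step i passes through A^i; fit the exponential
# rate beta from the log of the matrix-power norms at large i, where the
# dominant eigenvalue governs growth.
steps = np.arange(30, 61)
log_norms = [np.log(np.linalg.norm(np.linalg.matrix_power(A, int(i))))
             for i in steps]
beta = np.polyfit(steps, log_norms, 1)[0]  # fitted growth rate ~ ln(1.05)
```

The fitted slope recovers $\ln|\lambda_{\max}| = \ln 1.05 \approx 0.049$, matching the $g(i) \approx \alpha e^{\beta i}$ form above.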

3. Empirical Identification via Horizon Activation Mapping

Horizon Activation Mapping (HAM) provides a model-agnostic framework to measure and visualize the exponential nature of SpaceTime's autoregressive activities. In HAM's "causal mode", the mean gradient norm at horizon index $i$ is computed as:

$$g_c(i) = \mathbb{E}_{b \in \text{batches}} \left\| \nabla_{\theta} \left( \frac{1}{H} \sum_{t=1}^{H} M_c(i, H)_t \cdot \ell_t(y_t, \hat{y}_t) \right) \right\|_2$$

where $M_c(i, H)_t$ is a binary mask selecting timesteps $t \leq i$.
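The mask and the mask-weighted norm are straightforward to sketch; in a real run the per-step gradients would come from autodiff, and `ham_causal_norm` is an illustrative name rather than the paper's API.

```python
import numpy as np

def causal_mask(i, H):
    """M_c(i, H): binary mask over horizon steps t = 1..H selecting t <= i."""
    return (np.arange(1, H + 1) <= i).astype(float)

def ham_causal_norm(step_grads, i):
    """|| (1/H) * sum_t M_c(i,H)_t * grad_t ||_2 for one batch.

    step_grads: array of shape (H, n_params), where row t holds the gradient
    of the per-step loss l_t(y_t, y_hat_t) with respect to the parameters.
    """
    H = step_grads.shape[0]
    m = causal_mask(i, H)
    return float(np.linalg.norm((m[:, None] * step_grads).sum(axis=0) / H))
```

Averaging `ham_causal_norm` over batches and sweeping $i$ from 1 to $H$ traces out the $g_c(i)$ curve whose shape (linear vs. exponential) is the diagnostic signal.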

Key empirical findings from HAM in SpaceTime on the ETTm2 benchmarking suite include:

  • For short horizons ($H=96$), $g_c(i)$ is nearly linear, indicating eigenvalues near the unit circle ($\beta \approx 0$).
  • For longer horizons ($H=192, 336, 720$), $g_c(i)$ transitions to exponential, with log-plot slopes $\beta \approx 0.0025, 0.0038, 0.0044$ respectively, and fold-increases of up to 2.8x for $H=720$.
  • The anti-causal gradient norm $g_a(i)$ exhibits exponential decay, and the gradient-equivariant point (where $g_c(h^*) = g_a(h^*)$) shifts earlier as the horizon length increases.
  • Masking the forecasting branch (zero loss) roughly halves the gradient magnitude but preserves the exponential rate $\beta$ (Krupakar et al., 5 Jan 2026).
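The gradient-equivariant point can be located directly from the two curves. A sketch on synthetic exponential curves (the closed-form growth/decay rates below are assumptions chosen for illustration, not values from the paper):

```python
import numpy as np

def equivariant_point(g_c, g_a):
    """First horizon index h* with g_c(h*) >= g_a(h*).

    Assumes g_c is increasing and g_a is decreasing, as HAM reports for
    SpaceTime; returns -1 if the curves never cross in the sampled range.
    """
    hits = np.flatnonzero(np.asarray(g_c) >= np.asarray(g_a))
    return int(hits[0]) if hits.size else -1

i = np.arange(100)
g_c = 0.5 * np.exp(0.02 * i)   # causal: exponential growth
g_a = 2.0 * np.exp(-0.03 * i)  # anti-causal: exponential decay
h_star = equivariant_point(g_c, g_a)  # crossing near ln(4)/0.05 ~ 27.7
```

Raising the causal growth rate (e.g. $0.02 \to 0.04$) moves the crossing earlier, mirroring the observation that the equivariant point shifts toward smaller indices as exponential activity strengthens.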

4. Theoretical and Practical Implications

The emergence of exponential autoregressive activity in SpaceTime has direct implications for both optimization and forecasting:

  • Long-range Gradient Allocation: Exponential $g_c(i)$ indicates increasing gradient allocation to distant forecast steps, which benefits long-horizon accuracy but can overemphasize late-horizon noise.
  • Gradient Explosions and Model Stiffness: For $|\lambda| > 1$, early input gradients can explode; HAM allows immediate diagnosis by plotting $g_c(i)$, guiding interventions such as reducing the state dimension $n$, constraining spectral radii, or adding dropout.
  • Model Comparison and Selection: The exponential rate $\beta$ and the area difference between $g_c(i)$ and a uniform line provide robust metrics for comparing SpaceTime with alternative architectures (NHITS, FEDformer, etc.) and for matching model kernel decay to the dataset's empirical autocorrelation (Krupakar et al., 5 Jan 2026).
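The exact form of the area-difference metric is not spelled out here; one plausible L1 variant compares the normalized gradient-allocation curve against the uniform line (the function name and normalization are assumptions):

```python
import numpy as np

def area_difference(g):
    """L1 area between the normalized gradient-allocation curve and uniformity.

    g: per-horizon gradient norms g_c(i). Both curves are normalized to sum
    to 1, so the metric reflects allocation *shape*, not overall scale: 0
    means gradient is spread evenly across the horizon; values approaching 2
    mean it is concentrated on a few steps.
    """
    p = np.asarray(g, dtype=float)
    p = p / p.sum()
    u = np.full_like(p, 1.0 / p.size)
    return float(np.abs(p - u).sum())
```

Under this definition, two models can be ranked by how far their $g_c(i)$ curves deviate from uniform allocation, independent of their absolute gradient magnitudes.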

5. Expressivity, Efficiency, and Performance in Forecasting

SpaceTime's exponential autoregressive activities are rooted in its expressivity and computational design:

  • Exact Recovery of AR(p): SpaceTime matches ground-truth transfer functions and time-domain predictions for AR(4), AR(6) synthetic benchmarks, outperforming prior SSMs such as S4 and S4D.
  • Long-Horizon Generalization: On ETTh Informer multivariate forecasting tasks (horizons 96, 192, 336, 720+), SpaceTime achieves the lowest MSEs in most settings, with graceful error scaling on longer unseen horizons.
  • Training Efficiency: Leveraging an FFT-based sequence map with a low-rank trick, SpaceTime attains O(d log d + ℓ log ℓ) complexity, producing 2×–5× speedup in wall-clock time over Transformers and LSTMs for long sequence data (Zhang et al., 2023).
| Horizon | NLinear MSE | SpaceTime MSE |
|---------|-------------|---------------|
| 720     | 0.080       | 0.076         |
| 960     | 0.089       | 0.074         |
| 1800    | 0.102       | 0.081         |

HAM area-under-curve metrics correlate tightly with validation error, enabling informed early stopping and batch-size tuning (Krupakar et al., 5 Jan 2026, Zhang et al., 2023).

6. Relation to Broader Autoregressive and Extrapolation Paradigms

Exponential autoregressive activity in neural and hybrid models extends beyond SSMs. In the domain of quantum spin dynamics, autoregressive MLPs trained on local spacetime blocks have demonstrated exponential reach—i.e., prediction windows extending exponentially in the product of time and space grid size—for fixed parameter count, far surpassing the linear scaling of wavefunction-based simulators (DMRG, TEBD) (Pugzlys et al., 15 Dec 2025). A plausible implication is that carefully structured autoregressive feedback mechanisms, whether via SSMs or MLPs, can leverage exponential activity patterns for effective extrapolation in highly complex sequence and dynamical systems.

7. Regularization and Stability Considerations

The exponential nature of SpaceTime’s autoregressive activities makes stability and regularization essential:

  • If the exponential growth rate $\beta$ is too high, vanishing/exploding gradients may preclude effective learning.
  • HAM's direct visualization supports spectral clipping, dropout insertion, or model-size reduction to manage optimization "stiffness."
  • Monitoring when $\beta$ plateaus during training provides an early-overfitting signal, further supporting robust training and model generalization (Krupakar et al., 5 Jan 2026).
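One simple intervention of this kind is bounding the spectral radius of $A$. A global-rescaling sketch (SpaceTime itself may apply a different constraint; `clip_spectral_radius` is an illustrative name):

```python
import numpy as np

def clip_spectral_radius(A, rho_max=1.0):
    """Rescale A so its spectral radius is at most rho_max.

    Global rescaling shrinks *all* eigenvalues by the same factor; schemes
    that clip only the offending eigenvalues preserve the stable modes but
    require an eigendecompose-and-rebuild step.
    """
    rho = np.abs(np.linalg.eigvals(A)).max()
    return A * (rho_max / rho) if rho > rho_max else A
```

After clipping, $|\lambda_{\max}| \le \rho_{\max}$, so the exponential rate $\beta \sim \ln|\lambda_{\max}|$ is bounded above by $\ln \rho_{\max}$ (zero for $\rho_{\max} = 1$), directly taming the gradient growth that HAM visualizes.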

In summary, SpaceTime's exponential autoregressive activities are a direct consequence of its companion-parameterized state-space architecture and feedback mechanisms, illuminate key strengths and limits in long-horizon forecasting, and are quantifiable via HAM. These methods enable not only efficient and expressive time series modeling but also contribute to the development and interpretability of neural forecasting architectures with exponential propagation behaviors (Krupakar et al., 5 Jan 2026, Zhang et al., 2023).
