- The paper introduces MambaDSSE, which integrates Koopman lifting and selective state-space models to linearize complex, nonlinear DSSE dynamics without explicit grid topology.
- It demonstrates superior performance with up to a 30% MAE reduction on large-scale grids, effectively handling low observability and abrupt state transitions.
- Through probabilistic state filtering and adaptive noise modeling, the approach robustly manages measurement scarcity and DER-induced volatility.
Selective State-Space Models for Koopman-based Data-driven Distribution System State Estimation
Introduction
The integration of distributed energy resources (DERs) has fundamentally transformed distribution networks, amplifying their temporal complexity and variability. Effective Distribution System State Estimation (DSSE) under these conditions is essential for reliable grid operation, but faces severe observability limitations, nonlinearity, and measurement scarcity. Traditional model-based and recent deep learning techniques both suffer from scaling and generalization deficiencies given the high-dimensional, dynamic, and under-determined nature of distribution grids.
This paper introduces MambaDSSE, a model-free, purely data-driven architecture that couples Koopman-theoretic lifting with an input-adaptive, selective state-space model (SSM) based on the Mamba architecture. The approach learns the intrinsic temporal evolution of the grid state and requires neither a physical model nor explicit knowledge of the network topology. Probabilistic state filtering over the learned lifted latent space provides robust estimation and uncertainty quantification, even under high DER penetration and irregular sampling.
Methodology
MambaDSSE Framework
MambaDSSE leverages three key components:
- Learned Koopman Lifting: Input data—partial, noisy voltage and current phasors, as well as power injections—are projected into a high-dimensional latent space via a neural Koopman observable encoder. This enables linearization of the underlying nonlinear state dynamics.
- Selective Mamba SSM Backbone: The principal backbone is a structured, input-dependent SSM derived from the Mamba architecture. Each state variable is modeled with a dedicated SSM, and the matrices $A_t$ and $B_t$ and the process noise covariance $\Sigma_t^Q$ are generated by parameterizations conditioned on the prior state and measurement history. Constraints such as diagonal structure, eigenvalues with negative real part (Hurwitz stability), and positive-semi-definite covariance ensure numerical stability and computational efficiency.
- Probabilistic State Estimation: A Gaussian state-space model implements Bayesian filtering, with the Mamba-generated matrices governing transition and uncertainty evolution. Diagonal measurement and process noise structure further streamlines recursive estimation.
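The interplay of the last two components can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the softplus parameterization, the function names `selective_params` and `kalman_step`, and the identity observation of the observed latent coordinates are all hypothetical choices made here to show how diagonal, stable, input-dependent parameters feed a recursive Gaussian filter.

```python
import math

def softplus(x):
    # Numerically stable softplus; the result is always positive.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def selective_params(raw_a, raw_q, raw_dt):
    """Map unconstrained outputs (hypothetical network heads) to a stable
    diagonal transition and a positive diagonal process-noise variance."""
    dt = softplus(raw_dt)                        # step size > 0
    a_cont = [-softplus(r) for r in raw_a]       # continuous-time poles < 0
    a_disc = [math.exp(dt * a) for a in a_cont]  # discretized, each in (0, 1)
    q = [softplus(r) for r in raw_q]             # variances > 0
    return a_disc, q

def kalman_step(mean, var, y, observed, a, q, r):
    """One predict/update step with diagonal A, Q, R.
    Assumption: observed latent coordinates are measured directly."""
    new_mean, new_var = [], []
    for i in range(len(mean)):
        m = a[i] * mean[i]                       # predicted mean
        p = a[i] * a[i] * var[i] + q[i]          # predicted variance
        if observed[i]:
            gain = p / (p + r[i])                # scalar Kalman gain
            m = m + gain * (y[i] - m)
            p = (1.0 - gain) * p
        new_mean.append(m)
        new_var.append(p)
    return new_mean, new_var

# Toy usage: three latent channels, the middle one unobserved.
a, q = selective_params([0.5, -1.0, 2.0], [-3.0, -3.0, -3.0], 0.0)
mean, var = kalman_step([0.0] * 3, [1.0] * 3, [1.0, 0.0, -1.0],
                        [True, False, True], a, q, [0.1] * 3)
```

Because every matrix is diagonal, the filter reduces to independent scalar recursions, which is the streamlining the bullet above refers to. Unobserved channels simply skip the update and carry the predicted variance forward.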
Figure 1: Overview of MambaDSSE’s architecture: learned Koopman lifting, Mamba-based SSM for matrix generation, and probabilistic filtering pipeline.
The system is optimized end-to-end via a composition of prediction and reconstruction losses, enforcing that the Koopman-lifted observables evolve linearly in latent space while maintaining faithful recovery of the original nonlinear grid state.
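What "evolving linearly in latent space" means can be seen on a standard textbook toy system, which is not a grid model. In the sketch below the observables are picked by hand (MambaDSSE learns them with a neural encoder), and all function names and the constants `lam` and `mu` are illustrative assumptions. Appending the single observable x1^2 to the state makes the nonlinear map exactly linear, and the original state is recovered from the first two lifted coordinates.

```python
def step_nonlinear(x1, x2, lam=0.9, mu=0.5):
    """Toy nonlinear map: x2 is driven by the square of x1."""
    return lam * x1, mu * x2 + (lam ** 2 - mu) * x1 ** 2

def lift(x1, x2):
    """Hand-picked observables z = [x1, x2, x1^2]."""
    return [x1, x2, x1 ** 2]

def step_lifted(z, lam=0.9, mu=0.5):
    """The same dynamics, now a plain matrix-vector product."""
    K = [[lam, 0.0, 0.0],
         [0.0, mu, lam ** 2 - mu],
         [0.0, 0.0, lam ** 2]]
    return [sum(K[i][j] * z[j] for j in range(3)) for i in range(3)]

def unlift(z):
    """Reconstruction: the original state is the first two coordinates."""
    return z[0], z[1]

x = (1.5, -0.7)
z = lift(*x)
for _ in range(10):
    x = step_nonlinear(*x)   # nonlinear rollout in the original space
    z = step_lifted(z)       # linear rollout in the lifted space
max_gap = max(abs(u - v) for u, v in zip(x, unlift(z)))
```

After ten steps `max_gap` is at floating-point rounding level, since the lifted model is exact for this system. For real grid dynamics no finite exact lifting is generally known, which is why the observables are learned and a reconstruction loss is needed.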
Architectural and Training Details
All models are implemented without explicit knowledge of the grid topology, relying solely on available historical measurements. The latent dimension of the Mamba backbone is set between 256 and 512, and the Koopman lifting expands the input dimension by up to 2× to provide enough capacity for a linear latent representation. The training loss combines mean squared error on next-state prediction with mean squared error on reconstruction in the input space. Adam optimization, learning rate scheduling, and mini-batch training are used to stabilize convergence.
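A minimal sketch of such a combined objective is given below. The relative weighting of the two terms is not stated in this summary, so `recon_weight` is an assumption, as are the placeholder `encode` and `decode` callables and the diagonal latent transition `a_t`.

```python
def mse(u, v):
    """Mean squared error between two equal-length sequences."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) / len(u)

def koopman_loss(x_t, x_next, encode, decode, a_t, recon_weight=1.0):
    """Next-state prediction MSE plus reconstruction MSE, both measured
    in the input space. All arguments are illustrative placeholders."""
    z_t = encode(x_t)
    z_pred = [a * z for a, z in zip(a_t, z_t)]  # diagonal linear latent step
    pred_loss = mse(decode(z_pred), x_next)     # did the linear step land right?
    recon_loss = mse(decode(z_t), x_t)          # does decode invert encode?
    return pred_loss + recon_weight * recon_loss

# Toy encoder/decoder pair, chosen only so the numbers are easy to check.
encode = lambda x: [x[0], x[1], x[0] * x[1]]
decode = lambda z: z[:2]
perfect = koopman_loss([1.0, 2.0], [0.5, 1.0], encode, decode, [0.5, 0.5, 0.25])
off = koopman_loss([1.0, 2.0], [0.5, 2.0], encode, decode, [0.5, 0.5, 0.25])
```

Here `perfect` is zero because the latent step reproduces the next state exactly, while `off` picks up a prediction penalty from the mismatch in the second component.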
Experimental Results
State Estimation Accuracy and Scalability
MambaDSSE is evaluated on four SMART-DS feeders of increasing size (71, 435, 1573, and 6746 buses) in a low-observability regime where only 10% of measurements are accessible. It is compared against LSTM and Mamba sequence-to-point models (direct mappings without filtering) and against an LSTM+SE variant, so that the LSTM baseline appears both with and without a state estimation component.
In all scenarios, MambaDSSE demonstrates superior MAE and RMSE performance, particularly as system size increases. There is a pronounced performance degradation for direct mapping approaches on larger feeders, whereas MambaDSSE maintains consistent accuracy. On the largest case (6746 buses), it achieves a 30% MAE reduction over all baselines.
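For readers less familiar with these metrics, the sketch below shows how MAE, RMSE, and a relative MAE reduction are computed. The voltage values are made-up per-unit numbers chosen only to make the arithmetic easy to follow; they are not results from the paper.

```python
import math

def mae(pred, true):
    """Mean absolute error."""
    return sum(abs(p - t) for p, t in zip(pred, true)) / len(true)

def rmse(pred, true):
    """Root mean squared error."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true))

def relative_reduction(baseline_err, model_err):
    """Fraction by which the model lowers the baseline error."""
    return (baseline_err - model_err) / baseline_err

# Illustrative numbers only (per-unit voltage magnitudes at four buses).
true = [1.00, 1.02, 0.98, 1.01]
baseline_pred = [1.02, 1.00, 1.00, 0.99]    # every bus off by 0.02
model_pred = [1.014, 1.006, 0.994, 0.996]   # every bus off by 0.014

reduction = relative_reduction(mae(baseline_pred, true), mae(model_pred, true))
```

With these toy inputs the baseline MAE is 0.02 and the model MAE is 0.014, so `reduction` comes out at 0.3, which is what a "30% MAE reduction" denotes.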
Figure 2: Distribution of voltage magnitude and angle prediction errors for all buses in the 435-bus system.
Temporal Event Tracking and Long-Range Dependency Modeling
MambaDSSE exhibits strong fidelity during abrupt state transitions, most notably capturing voltage magnitude surges and phase angle shifts. Baseline models are prone to larger errors and unstable predictions during such events, but the selective SSM backbone of MambaDSSE preserves trajectory coherence and rapid adaptation.
Figure 3: Model predictions compared to ground truth during critical voltage and angle transitions; SSM-based architectures provide superior tracking, especially at breakpoints.
Ablations over input sequence length further indicate that only the Mamba-based selective SSM robustly exploits a longer history. LSTM performance decays as the input length increases, which is consistent with vanishing gradients and limited capture of long-range context.
Figure 4: Performance as a function of input sequence length—Mamba backbone (blue) benefits from longer history, unlike LSTM (red).
DER Penetration and Sampling Robustness
Across systematically varied DER penetration levels (from 0% to "extreme" scenarios), MambaDSSE achieves the most accurate state estimation, with up to 40% lower MAE than the strongest baseline at high DER levels. Performance remains robust for voltage magnitudes and for the more critical angle estimates, even as absolute errors grow with operational volatility.
Figure 5: MAE versus DER penetration shows lower and less variable errors for MambaDSSE compared to LSTM+SE baseline.
In sampling rate perturbation tests, only MambaDSSE keeps its error essentially unchanged when the input resolution shifts from 1 to 15 minutes. The learnable step parameter $\Delta_t$ in the Mamba SSM backbone is pivotal in adapting to such irregular sampling, in contrast to the LSTM-based models, whose errors grow by nearly 3×.
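The role of a step parameter can be illustrated with the zero-order-hold discretization commonly used in structured SSMs. The sketch below is an assumption-laden toy: it uses one scalar channel, a constant input, and a step passed in by hand, whereas in a Mamba-style backbone the step is produced by the network. The exact discretization rule used in the paper is not given in this summary.

```python
import math

def discretize(a_cont, b_cont, delta):
    """Zero-order-hold discretization of one scalar (diagonal) SSM channel."""
    a_bar = math.exp(delta * a_cont)
    b_bar = (a_bar - 1.0) / a_cont * b_cont
    return a_bar, b_bar

def simulate(a_cont, b_cont, delta, total_time, u=1.0, h0=0.0):
    """Roll the discretized channel forward with a constant input u."""
    steps = round(total_time / delta)
    a_bar, b_bar = discretize(a_cont, b_cont, delta)
    h = h0
    for _ in range(steps):
        h = a_bar * h + b_bar * u
    return h

# Same continuous-time channel, sampled every 1 minute vs. every 15 minutes.
h_fine = simulate(-0.2, 1.0, delta=1.0, total_time=15.0)
h_coarse = simulate(-0.2, 1.0, delta=15.0, total_time=15.0)
```

Both rollouts reach the same state after 15 minutes because the step size enters the transition explicitly. A recurrent model without such a parameter has no comparable mechanism for rescaling its dynamics when the sampling interval changes.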
Computational Overhead
SSM-based models (with state estimation) incur roughly 2× the runtime per prediction of direct mapping architectures, mainly because of the recursive filtering steps. However, total inference cost remains tractable (e.g., 2.6 ms per prediction for the 6746-bus system).
Discussion and Implications
The MambaDSSE model targets fundamental limitations in data-driven DSSE by:
- Shifting from direct input–output mappings to dynamic-system behavior modeling in a lifted Koopman space, enabling graceful scaling to high-dimensional, under-observed regimes.
- Harnessing selective SSM expressivity for long-range temporal dependency modeling, crucial for systems with highly time-correlated DER-driven dynamics.
- Achieving robustness against measurement noise, DER-induced volatility, varying sample rates, and incomplete topology knowledge.
- Enabling uncertainty quantification through probabilistic filtering, which is essential for operational risk management and integration into energy management systems.
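As a small illustration of the last point, a Gaussian posterior (mean and variance per state) can be turned directly into interval estimates and limit-violation probabilities. The sketch below is hypothetical: the function names, the 0.95 to 1.05 per-unit voltage band, and the example posterior are assumptions made for illustration and are not taken from the paper.

```python
import math

def credible_interval(mean, var, z=1.96):
    """Central interval of a Gaussian posterior (z=1.96 gives about 95%)."""
    sd = math.sqrt(var)
    return mean - z * sd, mean + z * sd

def violation_probability(mean, var, lower, upper):
    """Probability that the state lies outside [lower, upper]."""
    sd = math.sqrt(var)

    def cdf(x):
        # Gaussian cumulative distribution via the error function.
        return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

    return cdf(lower) + (1.0 - cdf(upper))

# Hypothetical posterior for one bus voltage: mean 1.04 pu, std 0.01 pu.
low, high = credible_interval(1.04, 1e-4)
p_violate = violation_probability(1.04, 1e-4, lower=0.95, upper=1.05)
```

In this toy case the point estimate is within limits, yet `p_violate` is close to 16% because the upper limit sits one standard deviation above the mean. A point estimator alone would not expose that risk.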
Architecturally, the model paves the way for further developments in efficient filtering (parallelized Bayesian updates), physically-consistent regularization, and adaptive learning of measurement and process covariances directly from operational data.
Conclusion
MambaDSSE establishes state-of-the-art data-driven DSSE by fusing Koopman-theoretic model linearization with selective input-dependent SSMs and probabilistic filtering. The approach addresses the grand challenges of scalability, measurement scarcity, and system volatility induced by DER integration, validated across a comprehensive spectrum of large-scale, realistic grid scenarios. Its efficacy at handling long-range temporal dependencies, input irregularities, and high-dimensional uncertainties positions it as a robust tool for future data-enabled autonomous power system operation.