Discriminative Sampling for Dynamical Systems
- Discriminative sampling is a targeted methodology that selects data-rich states based on metrics like entropy and Fisher Information to enhance model verification and discovery in dynamical systems.
- It employs algorithmic strategies such as SVM-margin minimization and GP entropy maximization to reduce sample complexity and improve estimation accuracy in complex, non-linear models.
- Applications span control-system verification, data-driven model recovery, and robust solution of inverse problems, demonstrating significantly lower error rates than random sampling.
Discriminative sampling for dynamical systems refers to a wide range of principled methodologies that actively focus sampling resources on those states, inputs, trajectories, or uncertainty realizations which maximize information gain, parameter identifiability, error reduction, or decision-margin, rather than relying on random, uniform, or purely passive experimental designs. This paradigm, spanning from Bayesian optimal design and active learning to entropy-based and information-theoretic acquisition, has been instrumental in verification, inference, model discovery, rare-event sampling, and inverse problems involving dynamical systems of various types.
1. Mathematical Formulation and Core Principles
Discriminative sampling targets the selection of data—either through simulation, policy design, or experimental intervention—that most effectively improves a task-specific criterion. The selection may be guided by maximizing:
- Prediction entropy or uncertainty (e.g., binary classification entropy, as in Gaussian process regression),
- Expected model change, e.g., via the support vector machine (SVM) margin,
- Statistical information metrics—notably the Fisher Information Matrix (FIM), Shannon entropy, and effective rank,
- Gradient-based criteria for objective optimization (e.g., maximizing separation in model discrimination).
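On toy inputs, the first three criteria reduce to simple acquisition scores. A minimal numpy sketch (the helper names are illustrative, not drawn from any cited paper):

```python
import numpy as np

def entropy_score(p):
    """Binary prediction entropy: highest when p is closest to 0.5."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -(p * np.log(p) + (1 - p) * np.log(1 - p))

def margin_score(decision_values):
    """SVM-style margin criterion: smaller |f(x)| means a more ambiguous point."""
    return -np.abs(decision_values)

def fim_trace_increment(theta_row, sigma2=1.0):
    """Increase in Tr I if one new library row is added (linear-in-parameter model)."""
    return float(theta_row @ theta_row) / sigma2

# Each acquisition rule picks the candidate maximizing its score.
probs = np.array([0.1, 0.45, 0.9])
print(np.argmax(entropy_score(probs)))  # -> 1, the most uncertain candidate
```

In each case the next sample is the argmax of the score over the candidate pool; the criteria differ only in what "informative" means for the task.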
These criteria are tightly coupled to the underlying dynamical system. For ODE/PDE model identification, the FIM is computed via data-derived libraries (e.g., SINDy), while in particle filter applications, the learned proposal distribution is optimized to match the posterior density, thus focusing particle resources where inference is most sensitive (Bao et al., 17 Dec 2025, Gama et al., 2023). For simulation-based verification and safety, discriminative sampling interrogates parametric uncertainty sets where satisfaction of temporal logic specifications is most ambiguous or is most likely to reduce misclassification (Quindlen et al., 2017, Quindlen et al., 2017).
2. Algorithmic and Methodological Approaches
A broad class of discriminative sampling algorithms is summarized in Table 1.
| Domain | Discriminative Sampling Mechanism | Core Metric / Acquisition Function |
|---|---|---|
| Simulation-based verification | SVM-margin minimization (Quindlen et al., 2017), GP entropy-max (Quindlen et al., 2017) | Distance to SVM decision boundary; GP posterior predictive entropy |
| Data-driven model discovery | FIM increment, entropy search (Bao et al., 17 Dec 2025) | ΔTr I, Δln det I, Δentropy |
| Model discrimination/fault detection | Quadratic input design + SDP relax. (Cheong et al., 2013) | Min/max separation between models |
| Stochastic inference, filtering | Unsupervised proposal learning (Gama et al., 2023) | Max E[log-likelihood] under q_θ |
| Inverse SDE problems | Residual maxima (RBMS) (An et al., 2023) | p(x) ∝ residual(x) |
Simulation-based verification utilizes active learning loops. For binary verification, an uncertainty grid is constructed, each sample is labeled via metric temporal logic robustness, and an RBF-kernel SVM classifier is trained. At each iteration, points closest to the SVM decision boundary are sampled to maximize expected model change; batch selection introduces a diversity term via cosine kernel similarity. Misclassification error and empirical variance are both dramatically reduced relative to random sampling and analytical certificates (Quindlen et al., 2017).
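The loop above can be sketched with scikit-learn, replacing the expensive simulation and temporal-logic check with a toy oracle (the unit-circle labeler and all constants are illustrative assumptions):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def simulate_and_label(x):
    # Stand-in for an expensive closed-loop simulation plus a temporal-logic
    # robustness check: "satisfied" iff the parameter lies inside the unit circle.
    return 1 if x[0] ** 2 + x[1] ** 2 < 1.0 else 0

# Uncertainty grid over the 2-D parameter set; seed with the two extreme points
# (guaranteeing both labels appear) plus a few random ones.
grid = rng.uniform(-2, 2, size=(400, 2))
norms = np.linalg.norm(grid, axis=1)
seed = {int(np.argmin(norms)), int(np.argmax(norms))}
seed |= set(rng.choice(400, 18, replace=False).tolist())
labels = {i: simulate_and_label(grid[i]) for i in seed}
n0 = len(labels)

for _ in range(30):  # active-learning iterations
    X = grid[list(labels)]
    y = np.array([labels[i] for i in labels])
    clf = SVC(kernel="rbf", gamma="scale").fit(X, y)
    unlabeled = [i for i in range(len(grid)) if i not in labels]
    # Query the most ambiguous point: smallest distance to the decision boundary.
    margins = np.abs(clf.decision_function(grid[unlabeled]))
    pick = unlabeled[int(np.argmin(margins))]
    labels[pick] = simulate_and_label(grid[pick])
```

Each iteration spends one "simulation" on the point the current classifier is least sure about; batch variants would add the cosine-similarity diversity term before the argmin.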
Gaussian process (GP) entropy-based algorithms for closed-loop verification use posterior predictive entropy over the specification function, with batch selection enforcing diversity via k-DPP (determinantal point processes). This approach achieves 20–50% lower misclassification error versus passive and variance-based sampling, strictly outperforming alternatives in >88% of reported tests (Quindlen et al., 2017).
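A minimal version of the entropy acquisition, using a GP regression surrogate for the specification function and a greedy similarity penalty as a cheap stand-in for the k-DPP diversity term (the sine specification, lengthscales, and batch size are illustrative assumptions):

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Toy specification function on a 1-D parameter space: rho(x) > 0 means "satisfied".
def rho(x):
    return np.sin(3 * x)

rng = np.random.default_rng(1)
X = rng.uniform(0, 2, size=(8, 1))
gp = GaussianProcessRegressor(kernel=RBF(0.3), alpha=1e-6).fit(X, rho(X).ravel())

cand = np.linspace(0, 2, 200).reshape(-1, 1)
mu, sd = gp.predict(cand, return_std=True)
p = np.clip(norm.cdf(mu / np.maximum(sd, 1e-9)), 1e-12, 1 - 1e-12)  # P(rho > 0)
H = -(p * np.log(p) + (1 - p) * np.log(1 - p))  # posterior predictive entropy

# Greedy batch of 5: entropy damped near already-chosen points (diversity proxy).
batch, score = [], H.copy()
for _ in range(5):
    j = int(np.argmax(score))
    batch.append(j)
    sim = np.exp(-((cand - cand[j]).ravel() ** 2) / (2 * 0.1 ** 2))
    score = score - H[j] * sim
```

The batch concentrates where P(satisfaction) is nearest 0.5 while the damping keeps its members spread out, mimicking what the determinantal prior achieves more rigorously.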
Data-driven model discovery employs incremental FIM- or entropy-based selection. Segments (already available or candidate) are scored by Fisher-information, A-optimality, D-optimality, or entropy criteria, and at each step the segment with maximal marginal improvement is selected. Bootstrapped bagging further enhances effective rank and estimator stability. Adaptive switching between coarser and finer temporal/trajectory sampling is performed as information metrics cross specified thresholds. These principled choices yield dramatic gains in model recovery under data constraints: in multiple classical and chaotic systems, 95% recovery is achieved using only 10–30% of the data required by random sampling (Bao et al., 17 Dec 2025).
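The greedy incremental selection can be sketched in a few lines; here the "segments" are random stand-ins for blocks of SINDy library rows, and the D-optimality (log-det) increment is the acquisition score (all sizes and the ridge term are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)

# Candidate trajectory segments, each contributing rows to the SINDy library
# matrix Theta (columns = candidate basis functions).
segments = [rng.normal(size=(25, 6)) for _ in range(40)]

def greedy_fim_selection(segments, n_pick, ridge=1e-6):
    """Greedily add the segment with the largest ln det I increment (D-optimality)."""
    q = segments[0].shape[1]
    I = ridge * np.eye(q)  # regularized running FIM keeps log-det well defined
    chosen, remaining = [], list(range(len(segments)))
    for _ in range(n_pick):
        _, base = np.linalg.slogdet(I)
        gains = [np.linalg.slogdet(I + segments[j].T @ segments[j])[1] - base
                 for j in remaining]
        best = remaining[int(np.argmax(gains))]
        I += segments[best].T @ segments[best]
        chosen.append(best)
        remaining.remove(best)
    return chosen, I

chosen, I = greedy_fim_selection(segments, n_pick=5)
```

Swapping the slogdet increment for a trace increment gives the A-optimality variant; thresholding the running metric is where the adaptive coarse/fine switching would hook in.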
3. Information-Theoretic and Statistical Underpinnings
Information theory is foundational for discriminative sampling in dynamical systems. The central objects are:
- Fisher Information Matrix (FIM), I(θ) = E[∇_θ log p(y|θ) ∇_θ log p(y|θ)ᵀ], which quantifies parameter identifiability:
For linear-in-parameter models with noise variance σ², I = (1/σ²) ΘᵀΘ, where Θ is the design (library) matrix. Design choices (e.g., maximizing Tr I or ln det I) lead to maximally informative trajectories, segments, or initial conditions (Bao et al., 17 Dec 2025).
- Shannon entropy and sample entropy metrics gauge unpredictability and data complexity within windowed segments—maximizing these can yield higher discovery efficiency but may not directly optimize identifiability.
- Residual error landscapes in inverse modeling are exploited to allocate new samples at residual peaks (local maxima). Focusing retraining on regions with high current error improves data efficiency and generalization, especially when multiple error modes or discontinuities are present (An et al., 2023).
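A minimal sketch of residual-peak allocation on a 1-D toy landscape, assuming two error modes of different heights (the specific Gaussians, the peak-height floor, and the p(x) ∝ residual(x) draw are illustrative, not the RBMS implementation itself):

```python
import numpy as np
from scipy.signal import find_peaks

rng = np.random.default_rng(5)

# Toy residual landscape of a surrogate over a 1-D input grid: two error modes.
x = np.linspace(0, 1, 500)
residual = (np.exp(-((x - 0.2) ** 2) / 0.002)
            + 0.6 * np.exp(-((x - 0.7) ** 2) / 0.001))

# Multi-peak selection: take every local maximum above a floor, not just the
# single argmax (which would miss the secondary error mode at x ~ 0.7).
peaks, _ = find_peaks(residual, height=0.1)
new_samples = x[peaks]

# Alternatively, draw stochastically with p(x) proportional to the residual.
probs = residual / residual.sum()
extra = rng.choice(x, size=10, p=probs)
```

The contrast with argmax-only schemes (e.g., RAR) is visible here: a single-point rule would keep hammering the mode near 0.2 and never visit the smaller one near 0.7.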
4. Applications and Case Studies
Discriminative sampling underpins several key applications:
- Verification of nonlinear/adaptive control systems: Active SVM and GP-based discriminative sampling frameworks reduce misclassification errors below 2.5% for Van der Pol and below 5% in more complex 4D saturated adaptive-control cases—systematically outperforming passive sampling and analytical certificates within fixed simulation budgets (Quindlen et al., 2017, Quindlen et al., 2017).
- Data-driven discovery of governing equations: FIM-informed adaptive sampling yields 95% recovery probability for Rössler/Van der Pol with only 10% of segments versus 60% for random sampling, and entropy-search over initial conditions lowers SINDy coefficient loss by 40% over random next-initial selection (Bao et al., 17 Dec 2025).
- Inverse problems with neural-ODE surrogates: Residual-based multi-peak sampling (RBMS) reduces training points to 20–30% of a full grid. By detecting and sampling all local residual maxima, RBMS outperforms methods that sample only the highest-error points (e.g., RAR) or partitioned continuous sampling; empirical robustness and resistance to overfitting under initial-value noise are also improved (An et al., 2023).
- Input design for model discrimination and fault detection: Discriminative input sequences determined via nonconvex quadratic or SDP relaxation maximize separation between dynamic models, enabling robust isolation under noise and limited experiments (Cheong et al., 2013).
- Particle filtering of dynamical systems: Discriminatively learned proposal distributions (MLP, RNN, GNN, invertible flow) in particle filtering concentrate particles in high-likelihood (posterior) regions, yielding significant gains in effective sample size and RMSE versus hand-designed proposals, especially in nonlinear and high-dimensional settings (Gama et al., 2023).
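The effect of a posterior-matched proposal is easy to see on a one-step linear-Gaussian toy model, where the locally optimal proposal p(x_t | x_{t-1}, y_t) is available in closed form (a learned q_θ would approximate exactly this object; the model constants below are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)
N, a, sw, sv = 500, 0.9, 1.0, 0.5  # particles, dynamics coeff, noise stds

x_prev = rng.normal(0.0, 1.0, N)   # equally weighted particles at t-1
y = 1.8                            # new observation

def ess(w):
    w = w / w.sum()
    return 1.0 / np.sum(w ** 2)    # effective sample size

# Bootstrap proposal: sample the transition, weight by the likelihood.
x_boot = a * x_prev + rng.normal(0, sw, N)
w_boot = np.exp(-0.5 * ((y - x_boot) / sv) ** 2)

# Posterior-matched proposal: p(x_t | x_{t-1}, y_t) for this linear-Gaussian
# model; the importance weight then depends only on p(y_t | x_{t-1}).
s2 = 1.0 / (1.0 / sw ** 2 + 1.0 / sv ** 2)
mean = s2 * (a * x_prev / sw ** 2 + y / sv ** 2)
x_opt = mean + rng.normal(0, np.sqrt(s2), N)
w_opt = np.exp(-0.5 * (y - a * x_prev) ** 2 / (sw ** 2 + sv ** 2))

print(f"ESS bootstrap: {ess(w_boot):.0f}, ESS matched proposal: {ess(w_opt):.0f}")
```

Because the matched proposal conditions on y_t, its weights vary far less across particles, which is the ESS gain the learned-proposal methods pursue in settings where no closed form exists.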
5. Theoretical and Practical Performance
Empirical and theoretical analyses indicate that discriminative sampling frameworks achieve:
- Reduction of sample complexity: Factor-of-2–5 reductions in the number of simulations or experiments required for a target error level, with gains especially pronounced in high-dimensional or chaotic regimes.
- Variance and robustness improvements: Lower variance in estimator error and increased resilience to measurement noise or unfavorable initial conditions, as seen in SINDy-bagging or residual-distribution-based sampling (Bao et al., 17 Dec 2025, An et al., 2023).
- Algorithmic convergence and stability: For gradient/saddle-point flows (e.g., for system L₂ gain or passivity), local and global convergence proofs apply, and robust practical rates are observed (Koch et al., 2019).
- Handling multimodality and structural complexity: Techniques such as residual-based multi-peak sampling or graph-structured particle filtering enable discriminative sampling in the presence of multimodal posterior structures or complex spatio-temporal dependencies (An et al., 2023, Gama et al., 2023).
6. Limitations, Caveats, and Practical Considerations
- Computational cost: FIM and entropy computation is O(q²m) or higher per segment; careful precomputation and thresholding are sometimes necessary for scalability in high dimensions (Bao et al., 17 Dec 2025).
- Overfitting and ill-conditioning: Over-emphasis on high-residual or high-information regions may bias parameter estimates or inflate the condition number; spectral regularization or bagging is recommended to mitigate these effects (Bao et al., 17 Dec 2025, An et al., 2023).
- Hyperparameter selection: For some acquisition criteria (e.g., entropy, D-optimality), the choice of information metric and surrogate model parameters can affect efficacy and stability, necessitating cross-validation or pilot studies.
- No free lunch for worst-case design: Semidefinite relaxations in model-discriminative input design are exact only in restricted cases; with many candidate models, the relaxation gap may grow, trading approximation quality for tractability (Cheong et al., 2013).
- Interpretability versus flexibility: Highly flexible methods (e.g., invertible neural flows in filtering) increase expressivity but may be less interpretable and require more computational resources (Gama et al., 2023).
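The ill-conditioning caveat is concrete: aggressively sampling one "informative" region can make the library columns nearly collinear, and a spectral (ridge) regularizer caps the condition number at the cost of some bias. A small numpy illustration (the monomial library and the 1e-3 ridge scale are assumptions for the demo):

```python
import numpy as np

rng = np.random.default_rng(4)

# Samples crowded into a narrow band make the monomial library columns
# t, t^2, t^3 nearly collinear.
t = rng.uniform(0, 0.1, 200)
Theta = np.column_stack([t, t ** 2, t ** 3])

I_raw = Theta.T @ Theta
print(f"condition number, raw FIM: {np.linalg.cond(I_raw):.1e}")

# Ridge regularization shifts every eigenvalue up by lam, shrinking the
# condition number (lambda_max + lam) / (lambda_min + lam).
lam = 1e-3 * np.trace(I_raw) / I_raw.shape[0]
I_reg = I_raw + lam * np.eye(3)
print(f"condition number, regularized: {np.linalg.cond(I_reg):.1e}")
```

Bagging over resampled segments, as used with SINDy, attacks the same pathology from the data side rather than the spectrum.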
7. Outlook and Generalization
Discriminative sampling has seen rapid methodological unification, connecting classical optimal design (Fisher, D-optimality), Bayesian active learning, and contemporary machine learning-based active sampling. The trend emphasizes principled, data- and resource-efficient strategies as foundational for robust, noise-tolerant, and scalable model verification, discovery, and statistical inference in dynamical systems. Application areas span control, physics, biological modeling, and engineering, with widespread adoption in simulation-based verification, system identification, experimental design, fault detection, Bayesian inference, and neural surrogate-based inverse problems. The continued development of computationally-efficient, theoretically-grounded, and robust discriminative sampling schemes remains a central concern for modern dynamical systems science and engineering.