Data-Driven Predictive Control (DPC)
- Data-Driven Predictive Control (DPC) is a technique that constructs future system trajectories as linear combinations of past recorded input-output data, eliminating the need for explicit system models.
- It uses convex regularization techniques, such as 1-norm and 2-norm penalties, to manage noise and complexity, resulting in piecewise-affine predictors and lowered online computational demands.
- DPC maintains feasibility through data redundancy and can match traditional Model Predictive Control performance, especially when explicit model identification is infeasible or expensive.
Data-Driven Predictive Control (DPC) is a class of receding-horizon control strategies that eliminate explicit model identification by parameterizing predictions directly as combinations of recorded input–output (I/O) trajectories. Rather than using a parametric model (state-space or input–output structure), DPC synthesizes predicted trajectories by reusing observed system data—subject to convex regularization and constraints—thereby enabling predictive control in settings where system identification is infeasible, unreliable, or too expensive. DPC has been extended to support regularization, nonlinearity, stochasticity, and application-specific architectural constraints, forming a rigorous alternative to classical model predictive control (MPC) in both theory and practice.
1. Foundational Principles and Problem Structure
At the core of DPC, the future I/O trajectory over a receding horizon is constructed as a linear combination of previously collected data segments. Two mathematically equivalent descriptions are prevalent:
- Input–Output Setting:
Collect ℓ historical trajectories, each of length L, and stack them as columns of a data matrix $D$.
For each new decision window, a generated trajectory is a linear combination $w = D a$ of the recorded columns, with coefficient vector $a \in \mathbb{R}^{\ell}$.
Initial conditions, prediction horizon, and feasibility are enforced by partitioning $D$ into past and future blocks, $D = \begin{pmatrix} D_p \\ D_f \end{pmatrix}$.
- State-Space Setting (Remark 1):
The same principle holds if state sequences are recorded or estimated, with the rank and feasibility structure carrying over unchanged.
Regularization is crucial to control model complexity and manage ill-posedness arising from noise or insufficient data. In particular, $1$-norm regularization introduces sparsity and a geometric bias favoring extreme points among the recorded trajectories, while $2$-norm (quadratic) or projection-based regularizations yield least-squares-like behaviors or enforce predictions onto trusted subspaces (Klädtke et al., 28 May 2025, Klädtke et al., 27 Mar 2025).
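As a concrete toy illustration of this principle (a minimal sketch of our own, not code from the cited papers), the following builds a data matrix from sliding windows of a simulated scalar LTI system and checks that any linear combination of its columns is again a valid trajectory:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a scalar LTI system y[k+1] = 0.8*y[k] + u[k] to collect data.
a_true, b_true = 0.8, 1.0
T = 60
u = rng.standard_normal(T)
y = np.zeros(T)
for k in range(T - 1):
    y[k + 1] = a_true * y[k] + b_true * u[k]

# Stack length-L input-output windows as columns of the data matrix D.
L = 8
cols = []
for k in range(T - L + 1):
    cols.append(np.concatenate([u[k:k + L], y[k:k + L]]))
D = np.stack(cols, axis=1)          # shape (2L, ell)

# Any linear combination w = D @ g is itself a valid system trajectory
# (exactly so in this noise-free LTI setting).
g = rng.standard_normal(D.shape[1])
w = D @ g
u_new, y_new = w[:L], w[L:]

# Check: the combined trajectory obeys the same recursion as the data.
resid = y_new[1:] - (a_true * y_new[:-1] + b_true * u_new[:-1])
print(np.max(np.abs(resid)))        # ~0 up to round-off
```

With noisy or nonlinear data the combination is no longer an exact trajectory, which is precisely where the regularization discussed below enters.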
2. Role and Mechanisms of Regularization in DPC
2.1. 1-Norm Regularization
DPC with a $1$-norm penalty $\lambda \lVert a \rVert_1$ on the coefficient vector
restricts model complexity via atom selection. Geometrically, the penalty is the atomic gauge of $\operatorname{conv}(\tilde{D})$, where $\tilde{D} = [D,\,-D]$ concatenates $D$ and its mirrored columns $-D$. Only data columns at extreme points of this convex hull are ever used, leading to a piecewise-affine (PWA), sparse predictor with a symmetry structure if the quadratic cost and regularizer are even functions. The predictor’s combinatorics and critical region (CR) boundaries scale homogeneously with λ, and offline extreme point removal drastically reduces online dimensionality without compromising optimality (Klädtke et al., 28 May 2025).
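The atom-selection effect can be illustrated numerically (our own sketch on a generic random dataset, not the cited papers' algorithm): solving the 1-norm regularized fit by iterative soft thresholding (ISTA) leaves only a few data columns active.

```python
import numpy as np

rng = np.random.default_rng(1)
D = rng.standard_normal((10, 40))            # 40 recorded trajectory segments
w = D[:, :3] @ np.array([1.0, -0.5, 0.2])    # target spanned by 3 columns

# ISTA for min_a 0.5*||w - D a||^2 + lam*||a||_1
lam = 0.1
step = 1.0 / np.linalg.norm(D, 2) ** 2       # 1/Lipschitz constant
a = np.zeros(D.shape[1])
for _ in range(5000):
    z = a - step * (D.T @ (D @ a - w))       # gradient step on the fit term
    a = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft threshold

active = np.sum(np.abs(a) > 1e-6)
print(f"active atoms: {active} of {D.shape[1]}")
```

Only a handful of the 40 columns carry nonzero weight, mirroring the extreme-point selection described above.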
2.2. Quadratic and Output-Only Regularization
Quadratic regularization, i.e., a penalty $\lambda \lVert a \rVert_2^2$, endows the DPC OCP with a structure analogous to least-squares subspace predictive control (SPC), with trajectory-specific costs penalizing not just deviation from reference but also from the statistical mean and covariance characterized by the dataset. Explicit projection-based regularization—applied on the orthogonal complement of the data subspace—targets output fit only, closely mirroring the behavior of analytical subspace predictors (Klädtke et al., 27 Mar 2025).
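The least-squares analogy admits a closed form; a short sketch (illustrative setup of our own, not the cited papers' code) shows the quadratically regularized prediction converging to the unregularized least-squares fit as λ → 0:

```python
import numpy as np

rng = np.random.default_rng(2)
D = rng.standard_normal((12, 30))
w = rng.standard_normal(12)

def ridge_prediction(D, w, lam):
    # argmin_a ||w - D a||^2 + lam*||a||^2  =>  a = (D^T D + lam I)^{-1} D^T w
    n = D.shape[1]
    a = np.linalg.solve(D.T @ D + lam * np.eye(n), D.T @ w)
    return D @ a

# Pure least-squares fit (the SPC-like limit).
w_ls = D @ np.linalg.lstsq(D, w, rcond=None)[0]
for lam in (10.0, 1.0, 1e-6):
    err = np.linalg.norm(ridge_prediction(D, w, lam) - w_ls)
    print(lam, err)    # shrinks toward 0 as lam -> 0
```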
2.3. Geometric and Theoretical Insights
The regularization cost for a target trajectory $w$ is characterized by the atomic norm associated with the convex hull of the reduced, sign-extended dataset,
$$\lVert w \rVert_{\mathcal{A}} = \min \Big\{ \textstyle\sum_i a_i \;:\; w = \textstyle\sum_i a_i \tilde d_i,\; a_i \ge 0,\; \tilde d_i \in \mathcal{E} \Big\},$$
where $\mathcal{E}$ is the set of convex hull extreme points (Klädtke et al., 28 May 2025).
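Since the atomic gauge is itself a linear program, it can be evaluated directly; the sketch below (our own construction, on a random dataset) computes it with scipy.optimize.linprog:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(3)
D = rng.standard_normal((6, 15))
w = D @ rng.standard_normal(15) * 0.1        # a representable target

# Atomic gauge w.r.t. the sign-extended dataset Dt = [D, -D]:
#   min sum(a)  s.t.  w = Dt a,  a >= 0
# which coincides with min ||g||_1 subject to w = D g.
Dt = np.hstack([D, -D])                      # mirrored columns
c = np.ones(Dt.shape[1])                     # minimize total weight
res = linprog(c, A_eq=Dt, b_eq=w, bounds=(0, None), method="highs")
atomic_norm = res.fun
print(atomic_norm)
```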
3. Implicit Predictors and Piecewise-Affine Structure
The DPC paradigm introduces the notion of implicit predictors, mapping current input/initialization data to the optimal predicted output without enforcing an explicit model constraint. Given the regularization and optimal control problem, for each fixed initialization, the prediction map is $\hat{y} = f(\xi, u)$, where $\xi$ collects the initializing past data and $u$ the planned inputs.
This map is PWA over critical regions defined by active data columns (atoms), and its symmetries mirror those of the cost and regularization. For quadratic costs and $1$-norm regularization, the implicit predictor is odd if cost and regularizer are even, i.e., $f(-\xi, -u) = -f(\xi, u)$ (Klädtke et al., 28 May 2025). Regularization and the atomic norm structure shape the bias–variance trade-off and locality of these predictors (Klädtke et al., 2023, Klädtke et al., 27 Mar 2025).
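The odd symmetry can be checked numerically. The following sketch (our own simplified stand-in for the full DPC problem: a 1-norm regularized fit of the past block, with the future block read off) confirms that negating the initialization negates the prediction:

```python
import numpy as np

rng = np.random.default_rng(4)
Lp, Lf, n_cols = 4, 4, 25
Dp = rng.standard_normal((Lp, n_cols))   # "past" rows of the data matrix
Df = rng.standard_normal((Lf, n_cols))   # "future" rows

def predict(wp, lam=0.05, iters=3000):
    # ISTA for min_a 0.5*||wp - Dp a||^2 + lam*||a||_1, then read off Df a.
    step = 1.0 / np.linalg.norm(Dp, 2) ** 2
    a = np.zeros(n_cols)
    for _ in range(iters):
        z = a - step * (Dp.T @ (Dp @ a - wp))
        a = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return Df @ a                        # predicted future trajectory

wp = rng.standard_normal(Lp)
# Even cost + even regularizer => odd predictor: predict(-wp) == -predict(wp).
print(np.max(np.abs(predict(-wp) + predict(wp))))
```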
4. Data Preprocessing, Feasibility, and Complexity
4.1. Extreme Point Pruning
The intrinsic sparsity stemming from $1$-norm regularization enables rigorous offline data reduction:
- Compute the set of necessary data columns as convex hull extreme points (mirrored if using $1$-norm).
- Discard all non-extreme columns.
- The solution space and all optimal policies remain unchanged.
This reduces online QP dimensions and accelerates online evaluation. The atomic norm representation renders this operation both necessary and sufficient for minimizing redundancy (Klädtke et al., 28 May 2025).
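The pruning step can be sketched in a few lines (a 2-D toy dataset of our own, so the hull is easy to compute): extreme points are found with scipy.spatial.ConvexHull, and the atomic gauge is unchanged after discarding interior columns.

```python
import numpy as np
from scipy.spatial import ConvexHull
from scipy.optimize import linprog

rng = np.random.default_rng(5)
D = rng.standard_normal((2, 40))             # 40 columns in R^2
Dt = np.hstack([D, -D])                      # sign-extended dataset

hull = ConvexHull(Dt.T)                      # hull vertices = extreme points
keep = np.unique(hull.vertices)
Dt_pruned = Dt[:, keep]
print(f"kept {Dt_pruned.shape[1]} of {Dt.shape[1]} columns")

def gauge(A, w):
    # atomic gauge: min sum(a)  s.t.  w = A a,  a >= 0
    res = linprog(np.ones(A.shape[1]), A_eq=A, b_eq=w,
                  bounds=(0, None), method="highs")
    return res.fun

w = np.array([0.3, -0.2])
print(gauge(Dt, w), gauge(Dt_pruned, w))     # identical values
```

The gauge depends only on the convex hull, so pruning to its extreme points is lossless, consistent with the claim above.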
4.2. Feasibility
Assuming the data matrix (or its augmented, slack-extended version) has full row rank, the DPC problem is feasible for every initialization; with slack variables, input and output constraints cannot render it infeasible. This recursive feasibility is structurally inherited from the data redundancy and the convex span formed by the data atoms (Klädtke et al., 27 Mar 2025).
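A quick numerical sanity check of the rank argument (our own sketch): full row rank means $D a = w$ is solvable for every right-hand side, so a feasible coefficient vector always exists.

```python
import numpy as np

rng = np.random.default_rng(6)
D = rng.standard_normal((8, 20))             # full row rank almost surely
assert np.linalg.matrix_rank(D) == D.shape[0]

w = rng.standard_normal(8)                   # arbitrary target trajectory
a = np.linalg.lstsq(D, w, rcond=None)[0]     # one feasible coefficient vector
print(np.linalg.norm(D @ a - w))             # ~0: the system is solvable
```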
4.3. Equivalence to Explicit MPC
In the deterministic (noise-free) setting with full row-rank data and appropriate persistence of excitation, explicit DPC and classical explicit MPC have identical QP complexity, the same PWA partitions (number of regions), and are algebraically equivalent following appropriate variable elimination and nullspace reduction steps (Klädtke et al., 2022). This shows that DPC entails no fundamental tractability disadvantages compared to standard MPC.
5. Symmetry, Model Structure, and Suitability
DPC’s implicit predictor inherits any symmetries in the cost and regularization. For linear time-invariant (LTI) systems, whose trajectory set is a subspace and hence symmetric about the origin, the odd symmetry induced by quadratic costs and $1$-norm regularization is well aligned. For nonlinear or asymmetric dynamics, this built-in symmetry can become a liability, resulting in enforced mirroring of the prediction map and systematic bias in unsymmetrical regions. Careful selection of λ, coordinate transformations, or alternative regularizations may be required if mirrored data biases are detrimental (Klädtke et al., 28 May 2025).
Numerical results on a nonlinear example system illustrate that the DPC predictor is forced through the origin, so systematic biases arise where the true dynamics are not odd-symmetric.
6. Design Guidelines and Practical Considerations
Effective DPC design with $1$-norm regularization mandates:
- Sparse data selection: Use only convex hull extremes of the mirrored data to reduce online complexity.
- Regularization tuning: Use λ to trade off tracking error against model complexity in a geometrically interpretable manner; scaling λ stretches the PWA regions and affects the degree of data locality.
- Symmetry awareness: Ensure that the enforced predictor symmetry suits the true system’s dynamics; if not, consider alternative regularization schemes or data preprocessing.
- Application match: $1$-norm DPC is especially suited for high-dimensional datasets where the true system order is low, when focusing on locally relevant behaviors, or when explicit PWA predictors are acceptable or beneficial.
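The λ trade-off in the second guideline can be made concrete with a small sweep (illustrative random data of our own, solved by plain ISTA): increasing λ trades fitting accuracy against the number of active data columns.

```python
import numpy as np

rng = np.random.default_rng(7)
D = rng.standard_normal((10, 50))
w = D @ rng.standard_normal(50) * 0.1

def lasso_ista(D, w, lam, iters=4000):
    # ISTA for min_a 0.5*||w - D a||^2 + lam*||a||_1
    step = 1.0 / np.linalg.norm(D, 2) ** 2
    a = np.zeros(D.shape[1])
    for _ in range(iters):
        z = a - step * (D.T @ (D @ a - w))
        a = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return a

results = []
for lam in (0.01, 0.1, 1.0):
    a = lasso_ista(D, w, lam)
    results.append((lam, np.linalg.norm(D @ a - w),
                    int(np.sum(np.abs(a) > 1e-6))))
for lam, err, nnz in results:
    print(f"lam={lam}: fit error {err:.3f}, active atoms {nnz}")
```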
Table: Summary of Key Properties of 1-Norm Regularized DPC
| Property | Description | Reference |
|---|---|---|
| Data Usage | Only extreme points of conv(±data columns) are necessary | (Klädtke et al., 28 May 2025) |
| Predictor Structure | Piecewise-affine, odd (if costs/regs even), regions scale with λ | (Klädtke et al., 28 May 2025) |
| Symmetry | Predictor odd/even if cost and regularizer are even | (Klädtke et al., 28 May 2025) |
| Data Reduction | Extreme point pruning yields identical optimal solution/cost | (Klädtke et al., 28 May 2025) |
| Feasibility | Full-row-rank data ensures recursive feasibility (all initials) | (Klädtke et al., 27 Mar 2025) |
| Online Complexity | Reduced after pruning, accelerates evaluation and QP solution | (Klädtke et al., 28 May 2025) |
The analytical framework established for $1$-norm DPC provides a rigorous geometric and operational foundation for DPC design and analysis, leading to robust, interpretable, and computationally efficient predictive control strategies in data-rich but model-poor environments.