Overtaking Optimality in Control & MDPs

Updated 21 January 2026

Overtaking optimality is an optimality criterion based on eventual performance dominance: in long-horizon and infinite-horizon problems, a policy is preferred if it performs at least as well as its alternatives over all sufficiently long horizons.
Formal definitions—strong, average, and weak overtaking optimality—support analysis in optimization problems with divergent, oscillatory, or ill-posed objectives in control and MDPs.
In practical applications like autonomous vehicle trajectory planning and traffic control, overtaking optimality guides robust policy design despite non-integrable or oscillatory cost functions.

Overtaking optimality is a foundational concept in long-horizon and infinite-horizon optimization, control theory, Markov decision processes, and practical applications in vehicle motion planning and traffic control. The central principle is to define optimality in terms of eventual domination: a policy or control is overtaking optimal if, after a finite transient, it performs at least as well as any competitor over all sufficiently long horizons. This concept is essential when objective functionals diverge, oscillate, or are ill-posed in the classical sense, as in zero-discount Markov decision processes or infinite-horizon control with non-decaying payoffs.

1. Formal Definitions and Main Criteria

Overtaking optimality admits several closely related formal definitions, tailored to the structure of the underlying control or decision process:

Strong Overtaking Optimality: For two reward streams $u=(u_1,u_2,\ldots)$ and $v=(v_1,v_2,\ldots)$ in $U \subset \mathbb{R}^\mathbb{N}$,

$u \succeq_{O} v \quad \iff \quad \liminf_{T \rightarrow \infty} \sum_{t=1}^T (u_t - v_t) \ge 0.$

Average Overtaking Optimality (AO):

$u \succeq_{AO} v \quad \iff \quad \liminf_{T \rightarrow \infty} \frac{1}{T} \sum_{t=1}^T \sum_{k=1}^t (u_k - v_k) \ge 0.$
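A short remark relating the two criteria above, writing $S_T = \sum_{t=1}^T (u_t - v_t)$ for the partial sums (the symbol $S_T$ is notation introduced here for illustration, not taken from the cited sources). Since Cesàro averaging cannot lower the limit inferior of a sequence,

$\liminf_{T \rightarrow \infty} \frac{1}{T} \sum_{t=1}^T S_t \;\ge\; \liminf_{T \rightarrow \infty} S_T,$

so $u \succeq_{O} v$ implies $u \succeq_{AO} v$. The converse can fail, for instance when $S_T$ oscillates around a nonnegative mean while repeatedly dipping below zero.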

Overtaking Optimality in Control (OO): For a control $u^*$ and any other admissible $u$,

$\limsup_{T \rightarrow \infty} [J^T(u^*) - J^T(u)] \leq 0,$

where $J^T(u) = \int_0^T g(t, x^u(t), u(t))\,dt$ is the finite-horizon cost (1512.01206, Khlopin, 2017, Khlopin, 2012).

Weak Overtaking Optimality (WOO): As above, replacing $\limsup$ with $\liminf$.

In Markov Decision Processes (MDPs), overtaking optimality is defined via finite-horizon comparison of cumulative reward or reachability probabilities; a policy overtakes another if, from some horizon onwards, its performance is strictly better at every subsequent horizon. In control, overtaking optimality replaces the direct comparison of divergent integrals with eventual superiority over all sufficiently large truncations.
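To see how OO and WOO can differ, consider a purely hypothetical pair of admissible controls $u^*$ and $u$ whose running costs differ by $\cos t$ along their trajectories (this pair is an illustrative assumption, not an example from the cited works). Then

$J^T(u^*) - J^T(u) = \int_0^T \cos t \, dt = \sin T,$

$\limsup_{T \rightarrow \infty} \sin T = 1 > 0, \qquad \liminf_{T \rightarrow \infty} \sin T = -1 \le 0.$

The OO inequality fails for this pair, while the WOO inequality holds. Swapping the roles of $u^*$ and $u$ gives $-\sin T$ and the same two limits, so in this sketch neither control overtakes the other in the OO sense, yet each passes the weak comparison against the other.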

2. Axiomatic and Existence Theory in Markov Decision Processes

In zero-discount ($\beta=1$) or average-reward MDPs, classical optimality (maximizing $\sum_{t=1}^\infty u_t$) is generally not well-posed, because the infinite sum may diverge or oscillate. An axiomatic approach considers robust preorders on bounded reward streams, seeking criteria under which stationary overtaking-optimal policies exist (Jonsson, 2017).

Key findings include:

Strong overtaking optimality does not guarantee the existence of stationary overtaking-optimal policies, even in finite deterministic MDPs (example: alternating rewards $u=(2,0,2,0,\ldots)$ vs. $v=(c,2,0,2,\ldots)$ with $0 < c < 2$).
Milder criteria such as average-overtaking and $0$-discount (Blackwell) optimality always admit stationary optimal policies, and can be characterized by small sets of preference axioms (monotonicity, anonymity, tail dominance, stationarity, translation-invariance, or a compensation principle).
The rigidity theorem: Any preorder optimal on stationary policies and satisfying standard axioms must coincide with the average-overtaking/Blackwell criterion (Jonsson, 2017).
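The alternating-reward example in the list above can be checked numerically. The following is a minimal sketch, assuming the value $c = 0.5$ and a truncation length of 2000 (both chosen here for illustration); the helper names `partial_sums` and `cesaro_means` are hypothetical. A finite truncation cannot prove a statement about a limit, but it shows the pattern: the partial sums of $u - v$ keep oscillating between $-c$ and $2 - c$, so neither stream strongly overtakes the other, while their running averages settle near $1 - c$.

```python
from itertools import accumulate, cycle, islice

def partial_sums(u, v):
    """Running sums S_T = sum_{t<=T} (u_t - v_t)."""
    return list(accumulate(a - b for a, b in zip(u, v)))

def cesaro_means(s):
    """Running averages (1/T) * sum_{t<=T} S_t."""
    return [total / (i + 1) for i, total in enumerate(accumulate(s))]

c = 0.5    # illustrative choice; any value with 0 < c < 2 behaves similarly
n = 2000   # truncation length (illustrative)

u = list(islice(cycle([2.0, 0.0]), n))            # (2, 0, 2, 0, ...)
v = [c] + list(islice(cycle([2.0, 0.0]), n - 1))  # (c, 2, 0, 2, ...)

s = partial_sums(u, v)
tail = s[n // 2:]

# Strong overtaking looks at liminf of S_T: the tail keeps returning to -c < 0,
# and the mirrored comparison (v against u) keeps returning to c - 2 < 0.
print("tail min/max of S_T:", min(tail), max(tail))

# Average overtaking looks at the Cesaro means of S_T instead.
print("last Cesaro mean:", cesaro_means(s)[-1])
```

Under these assumptions the averaged criterion ranks the two streams by the sign of $1 - c$, which is the sense in which average overtaking remains decisive where the strong criterion leaves the streams incomparable.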

These results position overtaking-type criteria, in their average-overtaking form, as the most selective criteria that still guarantee the existence of stationary optimal policies.

3. Overtaking Optimality in Infinite-Horizon Optimal Control

Overtaking optimality is indispensable in infinite-horizon optimal control when the improper integral cost is divergent or oscillatory. Definitions rest on limiting superiority in cumulative cost:

$u^* \text{ is overtaking optimal if } \forall u, \; \limsup_{T \to \infty} [J^T(u^*) - J^T(u)] \le 0.$
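The role of the transient in this definition can be illustrated on a toy problem. The sketch below assumes a scalar system $\dot{x} = -x + u$, $x(0) = 0$, with running cost $g = (x - 1)^2 + u^2$ and compares two constant controls, $u \equiv 0.5$ and $u \equiv 0$; the system, the cost, and the function name `truncated_cost` are assumptions made for illustration and do not come from the cited papers. Both truncated costs grow without bound, so the infinite-horizon integrals cannot be compared directly, but their difference can.

```python
import math

def truncated_cost(a, T):
    """J^T for the constant control u(t) = a in the toy problem
    x' = -x + u, x(0) = 0, g = (x - 1)^2 + u^2.

    Uses the closed form obtained from x(t) = a * (1 - exp(-t)).
    """
    slope = (a - 1.0) ** 2 + a ** 2  # long-run cost per unit time
    transient = (-2.0 * a * (a - 1.0) * (1.0 - math.exp(-T))
                 + 0.5 * a ** 2 * (1.0 - math.exp(-2.0 * T)))
    return slope * T + transient

def cost_gap(T):
    """J^T(u = 0.5) - J^T(u = 0); negative means u = 0.5 is cheaper up to T."""
    return truncated_cost(0.5, T) - truncated_cost(0.0, T)

for T in (0.5, 1.0, 10.0, 100.0):
    print(f"T = {T:6.1f}   J^T(0.5) - J^T(0) = {cost_gap(T):+.3f}")
```

In this sketch the control $u \equiv 0$ is cheaper on short horizons because it spends no control effort, while $u \equiv 0.5$ has the lower long-run cost rate and is cheaper on every sufficiently long horizon. This only compares two fixed candidates; it does not show that either is overtaking optimal among all admissible controls.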

Pontryagin's Maximum Principle (PMP) delivers necessary conditions, but requires nonclassical transversality:

Vanishing shadow-price condition: The adjoint (costate) arc must satisfy a boundary condition at infinity. The classical requirement $\lim_{t \to \infty} \psi(t) = 0$ fails in many models; instead, a Cauchy-type formula involving the improper integral of the running cost gradient provides a necessary and often sufficient selection criterion (1512.01206, Khlopin, 2017, Khlopin, 2012).
Existence/uniqueness of overtaking-optimal multipliers (adjoints) is ensured under mild growth or monotonicity conditions (e.g., monotone systems, subexponential discount) (Khlopin, 2012, Khlopin, 2017).
Classical transversality conditions are superseded by transversality-at-infinity conditions involving the asymptotic behavior of subdifferentials or the value gradient with respect to the initial state (Khlopin, 2017).

Overtaking optimality is shown to select the unique admissible (saddle-path) solution in canonical economic growth models, such as the zero-discount Ramsey problem (1512.01206).

4. Overtaking in Linear-Quadratic and Stochastic Control Problems

In infinite-horizon linear-quadratic (LQ) regulation with possibly non-integrable disturbances and no stabilizability assumptions, overtaking optimality provides a weaker solution concept that often still admits necessary and sufficient characterizations (Huang et al., 2020, Li et al., 2024):

In the absence of finite-cost (strong) optimal controls, overtaking optimality still allows for the existence and explicit construction of controls whose truncated costs eventually outperform all alternatives (Li et al., 2024, Huang et al., 2020).
Characterization arises via infinite-horizon Fredholm or Riccati equations, but with tail-term adjustments handling the divergent aspects of the problem.
Conditions for nonexistence and variational inequalities are derived; overtaking optimal policies may not exist unless additional boundedness, integrability, or convexity constraints are imposed (Huang et al., 2020).

In online control with unknown disturbances, overtaking optimality underpins feedback controller design that achieves the best possible infinite-horizon transient performance in the overtaking sense, and recovers steady-state optimality even when the limit cost diverges (Li et al., 2024).

5. Overtaking Optimality in Planning, Traffic, and Autonomous Systems

a) Autonomous Vehicle Overtaking and Trajectory Planning:

Data-driven overtaking planners (e.g., Predictive Spliner (Baumann et al., 2024), M-Predictive Spliner (Imholz et al., 2025), FSDP (Hu et al., 2025)) use Gaussian process prediction of opponent motion to define and solve finite-horizon optimal control or trajectory optimization problems. Performance metrics include the overtaking-success rate $\mathcal{R}_{ot/c}$, maximum speed ratios, safety margins, and smoothness (e.g., jerk, steering-rate). Here the term is used more loosely than in the formal criteria above: overtaking optimality refers to a planner's ability to generate maneuvers that, with high probability and over repeated trials, avoid collisions and maintain minimum-time properties in the long run.

Sequential Quadratic Programming (SQP), bi-level QPs, and predictive MPC frameworks all instantiate overtaking-optimal planning by anticipating opponent trajectories and solving minimum-time or minimum-cost (including safety) problems over regions of potential collision.
Empirical results indicate that learning-based predictive planners significantly increase overtaking-success rates and achievable speed scalers, approximating global overtaking optimality in practice (Baumann et al., 2024, Hu et al., 2025, Imholz et al., 2025).

b) Rule-Adhering Overtake Planning:

Minimum-violation planning schemes cast overtaking as a lexicographically ordered multi-objective problem, prioritizing rule adherence (e.g., collision avoidance, stay-in-lane) over time minimization. A trajectory is overtaking-optimal in this sense if it minimizes (in order of priority) weighted violation measures of temporal logic rules, then time (Wongpiromsarn et al., 2020).
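The lexicographic ordering described above can be sketched in a few lines. This is a minimal illustration, assuming each candidate trajectory has already been scored; the `Candidate` class, the candidate names, and all numeric values are hypothetical and are not taken from Wongpiromsarn et al. (2020). Violation measures are listed from highest to lowest rule priority, and maneuver time is compared only when all violation measures tie.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Candidate:
    name: str
    violations: tuple  # rule-violation measures, highest priority first
    time: float        # maneuver duration in seconds

def lexicographic_key(cand):
    # Python compares tuples element by element, which is a lexicographic order:
    # a lower-priority entry matters only if all higher-priority entries are equal.
    return (*cand.violations, cand.time)

def select(candidates):
    """Return the candidate that is minimal in the lexicographic order."""
    return min(candidates, key=lexicographic_key)

# Hypothetical scores: (collision-rule violation, lane-rule violation), time
candidates = [
    Candidate("aggressive", (0.0, 0.8), 4.1),
    Candidate("cautious", (0.0, 0.0), 6.3),
    Candidate("reckless", (0.2, 0.0), 3.5),
]

best = select(candidates)
print(best.name)
```

With these scores the slowest candidate is selected, because it is the only one that violates no rule; a faster maneuver wins only if it matches the best candidate on every higher-priority violation measure.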

c) Traffic Flow and Multi-Agent Effects:

Cellular automata traffic models such as the Nagel-Schreckenberg (NS) model augmented with overtaking strategies (NSOS) identify regimes where overtaking improves or harms collective throughput. The presence of a social dilemma—where each vehicle's optimal strategy is overtaking, but collective welfare is reduced at critical densities—demonstrates the complex implications of overtaking optimality in decentralized systems (Simão et al., 2020, Su et al., 2015).
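For orientation, the baseline single-lane Nagel-Schreckenberg update that such models build on can be sketched as follows. This is only the standard NS rule set (accelerate, brake to the available gap, random slowdown, move) on a ring road; it does not reproduce the overtaking rules of the NSOS model, and the function name `ns_step`, the parameter values, and the initial condition are illustrative assumptions.

```python
import random

def ns_step(positions, speeds, road_len, v_max=5, p_brake=0.3, rng=random):
    """One parallel update of the basic Nagel-Schreckenberg model on a ring."""
    n = len(positions)
    order = sorted(range(n), key=lambda i: positions[i])
    new_pos, new_spd = list(positions), list(speeds)
    for rank, i in enumerate(order):
        ahead = order[(rank + 1) % n]
        # Number of free cells between this car and the next car ahead
        gap = (positions[ahead] - positions[i] - 1) % road_len
        v = min(speeds[i] + 1, v_max)   # 1. accelerate
        v = min(v, gap)                 # 2. brake to avoid a collision
        if v > 0 and rng.random() < p_brake:
            v -= 1                      # 3. random slowdown
        new_spd[i] = v
        new_pos[i] = (positions[i] + v) % road_len  # 4. move
    return new_pos, new_spd

# Illustrative run: 30 cars on a ring of 100 cells, all starting at rest
rng = random.Random(0)
road_len, n_cars = 100, 30
positions = rng.sample(range(road_len), n_cars)
speeds = [0] * n_cars
for _ in range(200):
    positions, speeds = ns_step(positions, speeds, road_len, rng=rng)

flow = sum(speeds) / road_len  # cars passing a cell per time step
print(f"flow after 200 steps: {flow:.3f}")
```

Sweeping `n_cars` in a loop of this kind gives the flow-density relationship against which the effect of an added overtaking rule would be measured.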

6. Game-Theoretic and Multi-Agent Overtaking

In multi-agent racing and traffic, overtaking optimality is intertwined with equilibria in dynamic games. Stackelberg and zero-sum frameworks operationalize overtaking criteria as equilibrium selection rules: an overtaking-optimal policy is one that is not overtaken by any best response of an adversary (Ashkenazi-Golan et al., 2019, Fork et al., 2022). Advanced vehicle models (e.g., nonplanar racetrack, full two-track dynamics) expand the solution envelope, admitting overtaking trajectories not captured by simpler models (Fork et al., 2022).

7. Limitations, Nonexistence, and Structural Results

Strong overtaking optimality may be too selective, failing to admit optimal policies even in simple finite MDPs or control problems (Jonsson, 2017).
Existence theorems depend crucially on system dynamics, integrability of data, and convexity of admissible sets (Huang et al., 2020, Jonsson, 2017).
Transversality and overtaking criteria provide powerful, but not always sufficient, tools—additional regularity, monotonicity, or boundedness is often needed to ensure uniqueness of solutions and prevent pathological behavior (Khlopin, 2017, Khlopin, 2012).

References

Axiomatic and existence theory for MDPs: (Jonsson, 2017, Ashkenazi-Golan et al., 2019)
Infinite-horizon and LQ control: (1512.01206, Khlopin, 2017, Khlopin, 2012, Huang et al., 2020, Belyakov, 2019, Li et al., 2024)
Overtaking in trajectory planning and autonomous vehicles: (Brüdigam et al., 2021, Palatti et al., 2021, Baumann et al., 2024, Hu et al., 2025, Imholz et al., 2025, Fork et al., 2022, Wongpiromsarn et al., 2020)
Traffic, social dilemmas, and multi-agent settings: (Simão et al., 2020, Su et al., 2015, Dong et al., 2023)

These works encompass the spectrum from abstract optimality criteria in stochastic and deterministic systems, to rigorous necessary and sufficient conditions in infinite-horizon control, and on to algorithmic realizations in autonomous systems and traffic engineering. Overtaking optimality thus unifies a range of models where eventual performance, rather than pointwise maximization of possibly unbounded objectives, determines the truly optimal strategy.