Continuous-Time Subgradient Trajectories
- Continuous-time subgradient trajectories are absolutely continuous curves that solve differential inclusions defined by the subdifferential of an objective function.
- They generalize classical gradient methods to nonsmooth and composite settings, ensuring convergence under conditions such as the KL property and Lyapunov descent.
- Their analysis connects discrete algorithms with continuous dynamics, offering insights into robust optimization applications including robust PCA, phase retrieval, and hierarchical minimization.
A continuous-time subgradient trajectory is an absolutely continuous curve in a finite- or infinite-dimensional space that solves a differential inclusion involving the (possibly nonsmooth, nonconvex) subdifferential of an objective function. These flows generalize classical gradient dynamics to nonsmooth or composite settings, form the theoretical backbone of modern first-order methods for robust and high-dimensional optimization, and have deep connections to geometric and variational analysis.
1. Definition and Mathematical Framework
Let $f : \mathbb{R}^n \to \mathbb{R}$ be a locally Lipschitz function. The Clarke subdifferential at $x$ is defined as
$$\partial f(x) = \{\, v \in \mathbb{R}^n : \langle v, d \rangle \le f^\circ(x; d) \ \text{for all } d \in \mathbb{R}^n \,\},$$
where the Clarke directional derivative is
$$f^\circ(x; d) = \limsup_{y \to x,\; t \downarrow 0} \frac{f(y + t d) - f(y)}{t}.$$
A (continuous-time) subgradient trajectory is any absolutely continuous curve $\gamma : [0, \infty) \to \mathbb{R}^n$ that satisfies the differential inclusion
$$\dot\gamma(t) \in -\partial f(\gamma(t))$$
for almost every $t \ge 0$ (Josz et al., 2023, Lewis et al., 2022, Cai et al., 15 Jan 2026).
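As a concrete illustration, the inclusion can be discretized by forward Euler, which recovers the classical subgradient method. The following minimal sketch (a toy example, not taken from the cited works) integrates the flow of $f(x) = |x|$ with an explicit Clarke-subgradient selection:

```python
import numpy as np

def clarke_subgradient_abs(x):
    """One Clarke subgradient of f(x) = |x|; at the kink any element of [-1, 1] is admissible."""
    if x > 0.0:
        return 1.0
    if x < 0.0:
        return -1.0
    return 0.0  # valid selection from the interval [-1, 1] at x = 0

def euler_subgradient_flow(x0, dt=1e-3, n_steps=2000):
    """Forward-Euler discretization of the inclusion  gamma'(t) in -df(gamma(t))."""
    x = x0
    traj = [x]
    for _ in range(n_steps):
        x = x - dt * clarke_subgradient_abs(x)
        traj.append(x)
    return np.array(traj)

traj = euler_subgradient_flow(1.0)
```

Unlike the smooth gradient flow of, say, $x^2/2$, the $|x|$ flow reaches its critical point in finite time (the slope is constant away from zero), and the discretization reproduces this up to an oscillation of size at most the step.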
In the convex or Hilbert space setting, with $f$ proper, closed, and convex on a Hilbert space $H$, the inclusion generalizes to maximal monotone operator flows:
$$-\dot x(t) \in \partial f(x(t)) \quad \text{for a.e. } t > 0,$$
where $\partial f$ denotes the (maximal monotone) convex subdifferential.
The subgradient flow provides a continuous counterpart to discrete-time subgradient methods and supplies the limiting behavior for first-order optimization algorithms under vanishing step sizes (Cai et al., 15 Jan 2026, Zhang et al., 2024).
2. Existence, Uniqueness, and Regularity
For $f$ proper, lower semicontinuous, bounded below, and primal-lower-nice, for every initial point $x_0 \in \operatorname{dom} f$ there exists a unique locally absolutely continuous curve $\gamma$ with $\gamma(0) = x_0$ satisfying the continuous-time subgradient inclusion, and the energy dissipation identity holds:
$$f(\gamma(s)) - f(\gamma(t)) = \int_s^t \|\dot\gamma(\tau)\|^2 \, d\tau \quad \text{for all } 0 \le s \le t.$$
When $f$ is convex, existence and uniqueness follow from classical monotone operator theory (Brézis, Rockafellar). For nonconvex, nonsmooth functions, primal-lower-nice or related regularity conditions are required to guarantee well-posed flows (Lewis et al., 2022).
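The dissipation identity can be checked numerically along a smooth flow. For a toy quadratic (an illustrative choice, not from the cited works), the decrease of $f$ along an Euler-discretized gradient flow matches the Riemann sum of $\|\dot\gamma\|^2$:

```python
import numpy as np

# Smooth convex test function f(x) = 0.5 * x^T A x with its gradient.
A = np.array([[2.0, 0.0], [0.0, 0.5]])
f = lambda x: 0.5 * x @ A @ x
grad = lambda x: A @ x

# Integrate the gradient flow x'(t) = -grad f(x(t)) with small Euler steps,
# accumulating the dissipation integral of ||x'(t)||^2 = ||grad f(x(t))||^2.
dt, T = 1e-4, 2.0
x = np.array([1.0, -1.0])
f0 = f(x)
dissipated = 0.0
for _ in range(int(T / dt)):
    g = grad(x)
    dissipated += dt * (g @ g)
    x = x - dt * g
decrease = f0 - f(x)  # should match `dissipated` up to O(dt)
```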
3. Asymptotic Behavior, Stability, and Convergence
a. Lyapunov Descent and Critical Value Trapping
For $f$ locally Lipschitz, coercive, and tame (i.e., definable in an o-minimal structure), every subgradient trajectory $\gamma$ satisfies:
- $t \mapsto f(\gamma(t))$ is nonincreasing and converges to a critical value $f^*$ as $t \to \infty$;
- there exists a connected component $C$ of the critical set $\{x : 0 \in \partial f(x)\}$ with $f \equiv f^*$ on $C$;
- once near $C$, the trajectory $\gamma$ remains in every neighborhood of $C$ after some finite time (Josz et al., 2023).
Key inequalities include the descent identity
$$\frac{d}{dt} f(\gamma(t)) = -\|\dot\gamma(t)\|^2 \quad \text{for a.e. } t,$$
and, by the Kurdyka–Łojasiewicz (KL) inequality with desingularizing function $\psi$,
$$\frac{d}{dt}\, \psi\bigl(f(\gamma(t)) - f^*\bigr) \le -\|\dot\gamma(t)\|$$
in a neighborhood of each critical level $f^*$.
Integration yields the finite-length property $\int_0^\infty \|\dot\gamma(t)\|\, dt < \infty$, enforcing trajectory convergence and vanishing velocity at infinity (Josz et al., 2023, Lewis et al., 2022, Cai et al., 15 Jan 2026).
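The finite-length mechanism can be observed directly. For the strongly convex quadratic below (an illustrative example), one has $\|\nabla f(x)\|^2 = x^\top A^2 x \ge \lambda_{\min}(A)\, x^\top A x$, so the KL inequality holds with desingularizer $\psi(s) = 2\sqrt{s}$ and the trajectory length is bounded by $\psi(f(\gamma(0)) - f^*)$:

```python
import numpy as np

# f(x) = 0.5 x^T A x with lambda_min(A) = 0.5 gives ||grad f(x)|| >= sqrt(f(x)),
# which is the KL inequality with psi(s) = 2*sqrt(s); the flow's arc length
# is then bounded by psi(f(x0) - f*) with f* = 0.
A = np.diag([2.0, 0.5])
f = lambda x: 0.5 * x @ A @ x

dt, n_steps = 1e-3, 20000
x = np.array([1.0, -1.0])
f0 = f(x)
length = 0.0  # Riemann sum of ||gamma'(t)|| = ||grad f(gamma(t))||
for _ in range(n_steps):
    g = A @ x
    length += dt * np.linalg.norm(g)
    x = x - dt * g
bound = 2.0 * np.sqrt(f0)  # psi(f0 - f*) = 2*sqrt(1.25)
```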
b. KL Inequality and Identifiability
For proper, lower semicontinuous functions satisfying the KL property at a critical point, every subgradient trajectory entering a suitable neighborhood converges: $\gamma(t) \to \bar x$ as $t \to \infty$, with finite length $\int_0^\infty \|\dot\gamma(t)\|\, dt < \infty$.
After finite time, the trajectory is confined to an "identifiable" set, e.g., a manifold, and subsequently follows the projected smooth gradient dynamics on this set (Lewis et al., 2022).
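Finite-time identification can be seen in a small toy model (illustrative, not from the cited paper): for $f(x) = |x_1| + \tfrac12 x_1^2 + \tfrac12 (x_2 - 1)^2$, the flow reaches the manifold $M = \{x_1 = 0\}$ in finite time and afterwards follows the smooth gradient dynamics of $f$ restricted to $M$:

```python
import numpy as np

# f(x) = |x1| + 0.5*x1**2 + 0.5*(x2 - 1)**2.  The subgradient flow drives x1
# onto the identifiable manifold M = {x1 = 0} in FINITE time and keeps it there;
# the remaining dynamics are the smooth gradient flow of 0.5*(x2 - 1)**2.
dt, n_steps = 1e-3, 5000
x1, x2 = 0.5, -1.0
hit_time = None
on_manifold = True  # does the trajectory stay on M once it arrives?
for k in range(n_steps):
    if abs(x1) <= dt:
        # A step would overshoot the kink; since 0 is in sign(0)*[-1,1] + 0,
        # x1 = 0 is stationary for this coordinate, so snap to it.
        x1 = 0.0
        if hit_time is None:
            hit_time = k * dt
    else:
        x1 = x1 - dt * (np.sign(x1) + x1)
        if hit_time is not None:
            on_manifold = False
    x2 = x2 - dt * (x2 - 1.0)
```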
c. Nonautonomous and Multiscale Generalizations
In the nonautonomous setting, the inclusion generalizes as
$$\dot x(t) \in -\partial f_t(x(t)) \quad \text{for a.e. } t > 0,$$
where the functional $f_t$ evolves in time, possibly non-monotonically. Under Mosco-type convergence or monotonicity assumptions on $t \mapsto f_t$, trajectory convergence toward $\operatorname{argmin} f_\infty$ is established, where $f_\infty$ is the limit functional (Attouch et al., 2016).
Multiscale variants of the flow,
$$\dot x(t) + \partial \Phi(x(t)) + \varepsilon(t)\, \partial \Psi(x(t)) \ni 0,$$
reflect hierarchical or viscosity-type minimization and exhibit weak (or, in special settings, strong) convergence to solutions of a hierarchical minimization problem (Attouch et al., 2016).
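A smooth toy instance of such a multiscale flow (the specific form and functions here are illustrative assumptions) shows the viscosity-selection effect: with $\operatorname{argmin}\Phi$ a whole line, the vanishing term $\varepsilon(t)\nabla\Psi$ selects the point of $\operatorname{argmin}\Phi$ that minimizes $\Psi$:

```python
import numpy as np

# Smooth multiscale flow  x'(t) + grad Phi(x) + eps(t) * grad Psi(x) = 0
# with eps(t) = 1/(1+t) -> 0 and integral of eps divergent.
# Phi(x, y) = 0.5*x**2 has argmin Phi = {x = 0} (a line);
# Psi(x, y) = 0.5*||(x, y) - (1, 1)||^2.
# Viscosity selection should drive the trajectory toward (0, 1),
# the point of argmin Phi closest to (1, 1).
dt, T = 0.01, 200.0
z = np.array([2.0, -1.0])
t = 0.0
while t < T:
    eps = 1.0 / (1.0 + t)
    grad_phi = np.array([z[0], 0.0])
    grad_psi = z - np.array([1.0, 1.0])
    z = z - dt * (grad_phi + eps * grad_psi)
    t += dt
```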
4. Pathological and Nonconvergent Dynamics
Lipschitz continuity of $f$ alone does not guarantee that continuous-time subgradient trajectories approach critical points. Explicit constructions exist in which bounded periodic trajectories remain confined away from all critical points:
- For a suitably designed Lipschitz function $f$, one can select a continuous subgradient vector field $v(x) \in \partial f(x)$ such that the associated ODE $\dot\gamma = -v(\gamma)$ induces a bounded periodic orbit that remains disjoint from the critical set $\{x : 0 \in \partial f(x)\}$ for all time (Daniilidis et al., 2019).
This demonstrates that neither convexity nor regularity can be dispensed with if one wishes to guarantee approachability of generalized stationary points. This pathology is present in both continuous and discrete time; in the discrete case, the corresponding subgradient iterates may also have non-diverging, non-converging bounded orbits under analogous constructions (Daniilidis et al., 2019).
5. Connection to Discrete Algorithms and Lyapunov Functions
Discrete subgradient algorithms often admit an interpretation as an inexact or time-discretized version of continuous subgradient flows. The key technical step in many analyses is to demonstrate that the discrete sequence remains close to some continuous-time trajectory, and to exploit the Lyapunov structure of the latter.
For instance, for locally Lipschitz, semialgebraic $f$, uniform finite-length bounds for trajectories together with the KL property enable proofs of convergence for discrete subgradient iterates under prescribed step-size regimes (e.g., diminishing steps $\alpha_k \to 0$ with $\sum_k \alpha_k = \infty$), provided all continuous subgradient trajectories of $f$ are bounded (Cai et al., 15 Jan 2026).
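A minimal sketch of this regime (toy data and step-size constants are illustrative assumptions): the subgradient method with diminishing steps $\alpha_k = 0.2/\sqrt{k+1}$ on a polyhedral, hence semialgebraic, objective:

```python
import numpy as np

# Subgradient method with diminishing steps on f(x) = ||A x - b||_1,
# a polyhedral (semialgebraic) function whose unique minimizer is x_star
# with optimal value f(x_star) = 0.
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0], [2.0, 1.0]])
x_star = np.array([1.0, -2.0])
b = A @ x_star

f = lambda x: np.abs(A @ x - b).sum()

x = np.zeros(2)
best_x, best_f = x.copy(), f(x)
for k in range(50000):
    g = A.T @ np.sign(A @ x - b)       # a Clarke subgradient of f at x
    x = x - 0.2 / np.sqrt(k + 1.0) * g  # diminishing, non-summable steps
    fx = f(x)
    if fx < best_f:
        best_f, best_x = fx, x.copy()
```

The standard convex bound $\min_{k \le K} f(x_k) \le (\|x_0 - x^*\|^2 + G^2 \sum_k \alpha_k^2)/(2\sum_k \alpha_k)$ guarantees the best iterate approaches the minimizer here.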
In decentralized or stochastic settings, one identifies a Lyapunov function (possibly including consensus and momentum components) that decreases along solutions of the limit dynamics. Under appropriate stability of the Lyapunov level set and boundedness assumptions, one concludes global convergence for all perturbed or consensus-coupled processes whose interpolations approximate the continuous-time behavior (Zhang et al., 2024).
6. Applications and Composite Extensions
Continuous-time subgradient dynamics underlie the analysis and design of algorithms for:
- Robust PCA, robust phase retrieval, and robust matrix sensing, where the boundedness of subgradient flows is verified via analytic or geometric arguments depending on the subproblem structure (Cai et al., 15 Jan 2026).
- Proximal-gradient flows for composite objectives $f = g + h$, with $g$ smooth and $h$ nonsmooth, analyzed via the ODE
$$\dot x(t) = -\frac{1}{\lambda}\Bigl(x(t) - \operatorname{prox}_{\lambda h}\bigl(x(t) - \lambda \nabla g(x(t))\bigr)\Bigr),$$
which in the limit $\lambda \downarrow 0$ converges to the classic subgradient flow. Exponential convergence is guaranteed under proximal-PL or KL-type conditions (Gokhale et al., 2024).
- Hierarchical minimization and selection principles in multiscale and nonautonomous systems (Attouch et al., 2016).
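A sketch of such a proximal-gradient flow (the ODE form and problem data are illustrative assumptions): Euler integration for a simple $\ell_1$-regularized least-squares objective, whose equilibria are exactly the fixed points of the proximal-gradient map, i.e., the minimizers of $g + h$:

```python
import numpy as np

# Proximal-gradient flow for g(x) = 0.5*||x - b||^2 and h(x) = mu*||x||_1:
#   x'(t) = -(1/lam) * ( x - prox_{lam*h}( x - lam * grad g(x) ) ).
# With this g (design matrix = identity), the minimizer of g + h is the
# soft-thresholding of b at level mu.
b = np.array([1.0, 0.3, 0.05])
mu, lam = 0.1, 0.5

soft = lambda v, tau: np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)
prox_grad_map = lambda x: soft(x - lam * (x - b), lam * mu)

dt = 0.1
x = np.zeros(3)
for _ in range(2000):
    x = x - (dt / lam) * (x - prox_grad_map(x))
```

Note that choosing $dt = \lambda$ makes each Euler step exactly one proximal-gradient iteration, which is the usual discrete-to-continuous correspondence.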
The continuous-time subgradient theory is thus fundamental to understanding the long-term and stability properties of first-order methods, both in convex and in broad classes of structured nonconvex optimization.
Key literature: (Josz et al., 2023, Daniilidis et al., 2019, Lewis et al., 2022, Cai et al., 15 Jan 2026, Zhang et al., 2024, Gokhale et al., 2024, Attouch et al., 2016).