Factorized Robust Signal Recovery
- Factorized Robust Signal Recovery is an approach that decomposes signals into factors to robustly recover underlying structures even in nonconvex, nonsmooth settings.
- The framework employs continuous-time subgradient trajectories and differential inclusions to establish convergence and identify active manifolds in complex optimization landscapes.
- It underpins robust algorithms used in applications such as robust PCA, phase retrieval, and neural network training, enabling effective recovery of corrupted signals.
A continuous-time subgradient trajectory is an absolutely continuous curve that evolves according to a differential inclusion driven by the (typically nonconvex, possibly nonsmooth) subdifferential of an objective function. Such trajectories serve as a mathematical bridge between discrete optimization algorithms, such as the subgradient method, and the dynamical systems perspective, offering tools to analyze asymptotic convergence, manifold identification, and pathological behaviors in nonsmooth, nonconvex, or time-dependent settings. The formalism accommodates both static and nonautonomous (time-varying) optimization, tracks critical-point convergence via Lyapunov and Kurdyka–Łojasiewicz (KL) structures, and underpins the performance of first-order methods in large-scale signal recovery, machine learning, and variational analysis.
1. Differential Inclusion Formalism and Subdifferentials
Let $f : \mathbb{R}^n \to \mathbb{R}$ be locally Lipschitz. The Clarke subdifferential at $x$ is defined as

$$\partial f(x) = \{ v \in \mathbb{R}^n : \langle v, d \rangle \le f^\circ(x; d) \ \text{for all } d \in \mathbb{R}^n \},$$

where $f^\circ(x; d)$ denotes the Clarke directional derivative. A continuous-time subgradient trajectory is an absolutely continuous curve $x : [0, \infty) \to \mathbb{R}^n$ satisfying

$$\dot{x}(t) \in -\partial f(x(t))$$

for almost every $t \ge 0$. This is a differential inclusion generalizing the notion of gradient flow to nonsmooth, possibly nonconvex settings (Josz et al., 2023, Lewis et al., 2022, Cai et al., 15 Jan 2026).
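The differential inclusion can be approximated numerically by an explicit Euler scheme that selects one subgradient per step (this is exactly the discrete subgradient method with constant step size). A minimal sketch for $f(x) = |x|$, whose Clarke subdifferential at $0$ is the interval $[-1, 1]$:

```python
import numpy as np

def clarke_subgradient_abs(x):
    """One selection from the Clarke subdifferential of f(x) = |x|.

    At x = 0 the subdifferential is the whole interval [-1, 1];
    we select the element 0 (which is also the minimal-norm element)."""
    return np.sign(x)  # np.sign(0.0) == 0.0

def euler_subgradient_flow(x0, step=1e-3, T=2.0):
    """Explicit Euler discretization of  x'(t) in -df(x(t))  on [0, T]."""
    x = x0
    for _ in range(int(T / step)):
        x = x - step * clarke_subgradient_abs(x)
    return x

# Starting from x0 = 1, the continuous flow reaches the critical point 0
# at time t = 1 and stays there; the Euler iterate chatters within one
# step size of 0 afterwards.
x_final = euler_subgradient_flow(1.0)
```

The chattering near $0$ (oscillation of amplitude about one step size) is the discrete shadow of the set-valuedness of $\partial f$ at the kink; the continuous-time trajectory itself sits exactly at $0$ after time $1$.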
For a general maximal monotone operator $A$ on a Hilbert space, the dynamical system

$$\dot{x}(t) \in -A(x(t))$$

arises, with particular interest in the case $A = \partial \Phi$ for a proper, closed, convex function $\Phi$ (Attouch et al., 2016).
2. Existence, Uniqueness, and Regularity
If $f$ is proper, lower semicontinuous, bounded below, and primal-lower-nice (a generalization covering convex, weakly convex, and strongly amenable composite functions), any initial point yields a unique, locally absolutely continuous trajectory satisfying the subgradient differential inclusion (Lewis et al., 2022). These trajectories have finite energy, quantified by

$$\int_0^\infty \|\dot{x}(t)\|^2 \, dt \le f(x(0)) - \inf f < \infty.$$
In the convex case, classical results by Brézis and Marcellin–Thibault establish this for general maximal monotone evolutions (Lewis et al., 2022, Attouch et al., 2016).
In time-varying (nonautonomous) settings, for a family of proper, closed, convex functions $\{\Phi_t\}_{t \ge 0}$, global solutions exist for the nonautonomous inclusion (NAGI)

$$\dot{x}(t) \in -\partial \Phi_t(x(t)),$$

with convergence properties determined by Mosco-type assumptions and energetical inequalities (Attouch et al., 2016).
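A simple numerical sketch of the nonautonomous case: a quadratic objective $\Phi_t(x) = \tfrac{1}{2}(x - c(t))^2$ whose minimizer drifts along $c(t) = 1 - e^{-t}$ and converges to the limiting target $c_\infty = 1$ (the drift and rates here are illustrative choices, not taken from the cited papers):

```python
import numpy as np

def nonautonomous_flow(x0=0.0, step=1e-3, T=20.0):
    """Euler discretization of  x'(t) = -grad Phi_t(x(t)),
    with Phi_t(x) = 0.5 * (x - c(t))**2 and drifting target
    c(t) = 1 - exp(-t), which converges to c_inf = 1."""
    x, t = x0, 0.0
    for _ in range(int(T / step)):
        c_t = 1.0 - np.exp(-t)
        x = x - step * (x - c_t)  # -grad Phi_t(x)
        t += step
    return x
```

Because the drift is integrably fast relative to the contraction of each $\Phi_t$, the trajectory tracks the moving minimizer and converges to the minimizer of the limiting function, consistent with the Mosco-convergence picture.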
3. Global Convergence and Lyapunov Structure
For a broad class of objectives, namely those that are locally Lipschitz, coercive, and definable (tame) in an o-minimal structure (e.g., semialgebraic), every continuous-time subgradient trajectory converges to a neighborhood of a single connected component $C$ of the set of critical points $\operatorname{crit} f = \{x : 0 \in \partial f(x)\}$. Specifically:
- The objective value $f(x(t))$ is nonincreasing and converges to a critical value as $t \to \infty$.
- The distance to a connected component of critical points vanishes: $\operatorname{dist}(x(t), C) \to 0$.
- After some finite time, the trajectory remains in every pre-specified neighborhood of $C$.
These properties exploit the Kurdyka–Łojasiewicz (KL) inequality, with desingularizing functions encoding how the subdifferential norm lower-bounds the function-value gap (Josz et al., 2023, Zhang et al., 2024). Along any such trajectory,

$$\frac{d}{dt} f(x(t)) = -\|\dot{x}(t)\|^2 \quad \text{for a.e. } t,$$

so function values converge monotonically, and, under KL-type conditions, the entire trajectory has finite length:

$$\int_0^\infty \|\dot{x}(t)\| \, dt < \infty.$$

Discrete-time subgradient methods can be shown to track these flows over finite intervals and inherit their global convergence guarantees under suitable step-size regimes (Josz et al., 2023, Cai et al., 15 Jan 2026, Zhang et al., 2024).
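The discrete-tracking claim can be illustrated on a small semialgebraic nonsmooth objective. The sketch below runs a diminishing-step subgradient method on $f(x) = |x_1| + x_2^2$ (a hypothetical test function chosen here for simplicity); the $1/k$ steps are nonsummable, so the iterates follow the flow into the critical set $\{(0, 0)\}$:

```python
import numpy as np

def f(x):
    # Semialgebraic, nonsmooth test objective: f(x) = |x1| + x2**2,
    # with unique critical point at the origin.
    return abs(x[0]) + x[1] ** 2

def subgrad(x):
    # One Clarke subgradient of f at x (np.sign(0.0) == 0.0).
    return np.array([np.sign(x[0]), 2.0 * x[1]])

def subgradient_method(x0, n_iter=5000):
    """Diminishing-step subgradient method; the nonsummable steps 1/k
    make the iterates track the continuous subgradient flow."""
    x = np.asarray(x0, dtype=float)
    for k in range(1, n_iter + 1):
        x = x - (1.0 / k) * subgrad(x)
    return x
```

The $|x_1|$ term forces chattering of amplitude about $1/k$ around the kink, so the function value converges to the minimum at the $O(1/k)$ rate typical of subgradient methods, mirroring the finite-length behavior of the underlying flow.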
4. Pathological and Exceptional Dynamics
Despite the general convergence guarantees under coercivity, definability, or suitable KL properties, continuous-time subgradient flows may admit pathological behaviors for merely Lipschitz functions (Daniilidis et al., 2019). Explicit examples with $f$ Lipschitz on the plane exist where:
- Trajectories are periodic and remain bounded,
- The critical set $\{x : 0 \in \partial f(x)\}$ is never visited by the trajectory,
- The objective is not necessarily monotone along the flow.
The mechanism exploits the freedom within the set-valued Clarke subdifferential to engineer ODE selections whose solutions are classical rotations, avoiding all critical points. These constructions demonstrate that, absent regularity (tameness, convexity, KL conditions), Lipschitz continuity of $f$ is insufficient for subgradient convergence or Lyapunov decrease (Daniilidis et al., 2019). Analogous phenomena can occur in discrete-time subgradient methods with nonsummable step-sizes.
5. Extensions: Nonautonomous, Multiscale, and Proximal-Gradient Flows
Nonautonomous Gradient Systems
For time-dependent convex objectives $\Phi_t$, convergence of solutions to the minimizer set of the limiting function requires Mosco-type conditions and control of a Brézis–Haraux-type excess measuring the discrepancy between $\Phi_t$ and its limit. Appropriate summability conditions yield weak convergence to the minimizer set, while strong minima enable pointwise (strong) convergence (Attouch et al., 2016).
Multiscale and Hierarchical Minimization
Multiscale systems of the form

$$\dot{x}(t) + \partial \Phi(x(t)) + \beta(t)\, \partial \Psi(x(t)) \ni 0, \qquad \beta(t) \to +\infty,$$

encode hierarchical minimization, selecting minimizers of $\Phi$ over the set $\operatorname{argmin} \Psi$. Viscosity-type inclusions, in which a slowly vanishing regularizing term is attached instead, select regularized (e.g., minimal-norm) solutions (Attouch et al., 2016).
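A minimal numerical sketch of viscosity-type selection, using a hypothetical example: $\Phi(x) = \tfrac{1}{2}(x_1 + x_2 - 2)^2$ has a whole line of minimizers $\{x_1 + x_2 = 2\}$, and a slowly vanishing Tikhonov term $\varepsilon(t)\,\tfrac{1}{2}\|x\|^2$ with nonintegrable $\varepsilon(t) = 1/(1+t)$ steers the flow to the minimal-norm minimizer $(1, 1)$:

```python
import numpy as np

def multiscale_flow(x0, step=1e-2, T=400.0):
    """Euler scheme for  x'(t) = -grad Phi(x) - eps(t) * grad Psi(x),
    with Phi(x) = 0.5*(x1 + x2 - 2)**2   (argmin: the line x1 + x2 = 2),
         Psi(x) = 0.5*||x||**2,
         eps(t) = 1/(1 + t)              (vanishing but nonintegrable)."""
    x, t = np.asarray(x0, dtype=float), 0.0
    for _ in range(int(T / step)):
        grad_phi = (x[0] + x[1] - 2.0) * np.ones(2)
        eps = 1.0 / (1.0 + t)
        x = x - step * (grad_phi + eps * x)
        t += step
    return x

# Among all minimizers of Phi, the flow selects the minimal-norm point (1, 1),
# even when started from another point on the argmin line such as (3, -1).
```

The quasi-static picture: at each time the trajectory approximately minimizes $\Phi + \varepsilon(t)\Psi$, whose minimizer drifts toward the minimal-norm element of $\operatorname{argmin} \Phi$ as $\varepsilon(t) \to 0$.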
Proximal-Gradient Dynamics
For composite objectives $F = f + g$, where $f$ is differentiable convex and $g$ is convex (possibly nonsmooth), proximal-gradient flows are captured by

$$\dot{x}(t) = \frac{1}{\gamma}\left( \operatorname{prox}_{\gamma g}\!\big(x(t) - \gamma \nabla f(x(t))\big) - x(t) \right).$$

In the limit $\gamma \to 0^+$, this recovers the subgradient flow $\dot{x}(t) \in -\partial F(x(t))$. For $F$ satisfying a proximal-PL or KL-type inequality, such as

$$\tfrac{1}{2}\operatorname{dist}\big(0, \partial F(x)\big)^2 \ge \mu\,\big(F(x) - F^\star\big),$$

one obtains exponential convergence of $F(x(t))$ to the minimal value $F^\star$ (Gokhale et al., 2024).
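A one-dimensional sketch of the proximal-gradient flow on a lasso-style composite (an illustrative choice, not the problem studied in the cited work): $f(x) = \tfrac{1}{2}(x - 2)^2$, $g(x) = |x|$, whose minimizer is $x^\star = 1$, with the prox of $|\cdot|$ being the soft-threshold:

```python
import numpy as np

def soft_threshold(z, tau):
    """prox of tau * |.| evaluated at z."""
    return np.sign(z) * max(abs(z) - tau, 0.0)

def prox_grad_flow(x0, gamma=0.5, step=1e-2, T=20.0):
    """Euler scheme for the proximal-gradient flow
        x'(t) = (prox_{gamma*g}(x - gamma * grad f(x)) - x) / gamma,
    with f(x) = 0.5*(x - 2)**2 and g(x) = |x|; minimizer x* = 1."""
    x = x0
    for _ in range(int(T / step)):
        x = x + (step / gamma) * (soft_threshold(x - gamma * (x - 2.0), gamma) - x)
    return x
```

Since the composite objective here is strongly convex, it satisfies a proximal-PL inequality, and the iterates indeed contract geometrically toward $x^\star = 1$, matching the exponential-convergence statement above.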
6. Identifiability, Partial Smoothness, and Manifold Identification
Identifiability and partial smoothness predict that subgradient trajectories may, after finite time, be confined to a lower-dimensional manifold (active set) on which the objective is smooth. Under metric identifiability (no vanishing subgradients when approaching from outside the manifold) and regularity hypotheses (primal-lower-niceness, local smoothness), subgradient flows are shown to enter the manifold in finite time and subsequently evolve according to projected smooth dynamics (Lewis et al., 2022). For example, for a partly smooth objective such as

$$f(u, v) = u^2 + |v - u^2|,$$

all subgradient curves converge to the minimizer at the origin, but after some finite time, trajectories enter the parabola $\mathcal{M} = \{(u, v) : v = u^2\}$ and continue as smooth flows along it.
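Finite-time identification can be observed numerically. The sketch below integrates the subgradient flow of the hypothetical partly smooth example $f(u, v) = u^2 + |v - u^2|$: starting above the parabola, the trajectory first falls vertically onto the manifold $\{v = u^2\}$ at a finite time, and only then slides along it toward the origin:

```python
import numpy as np

def flow_with_identification(u0=1.0, v0=3.0, step=1e-3, T=6.0):
    """Euler scheme for the subgradient flow of
        f(u, v) = u**2 + |v - u**2|   (illustrative partly smooth example).
    Returns (first time the trajectory reaches the manifold v = u**2
    up to step-size resolution, final point)."""
    u, v, t = u0, v0, 0.0
    t_identify = None
    for _ in range(int(T / step)):
        s = np.sign(v - u * u)        # selection from the subdifferential of |.|
        gu = 2.0 * u - 2.0 * u * s    # d/du of u**2 + s*(v - u**2)
        gv = s                        # d/dv of the same selection
        u, v, t = u - step * gu, v - step * gv, t + step
        if t_identify is None and abs(v - u * u) <= 2.0 * step:
            t_identify = t
    return t_identify, (u, v)
```

From $(1, 3)$ the off-manifold dynamics are $\dot{u} = 0$, $\dot{v} = -1$, so identification occurs at $t \approx 2$ while $u$ is still far from the minimizer; afterwards the discrete iterates chatter within a step size of the parabola while $u$ decays smoothly, the sliding-mode shadow of the projected smooth dynamics.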
7. Applications and Impact on Algorithmic Analysis
Continuous-time subgradient trajectories underpin the analysis of convergence and avoidance of spurious critical points for discrete-time first-order methods applied to robust PCA, phase retrieval, decentralized stochastic optimization, and neural network training (Zhang et al., 2024, Cai et al., 15 Jan 2026, Gokhale et al., 2024). For semialgebraic and tame objectives, the boundedness and finite-length properties of subgradient flows ensure that for sufficiently small step-sizes, discrete iterates are trapped in neighborhoods of critical sets and converge globally, even in nonconvex, nonsmooth scenarios (Josz et al., 2023, Cai et al., 15 Jan 2026). Exceptions and cautionary examples limit this paradigm to settings with sufficient regularity, as evidenced by pathological constructions (Daniilidis et al., 2019).
The continuous-time perspective is thus a unifying analytic tool throughout modern nonsmooth optimization, geometrically informed analysis, and decentralized consensus, enabling precise statements about global dynamics, invariant sets, convergence rates, and the structure of limiting behaviors across a wide range of mathematical and algorithmic contexts.