Inertial Tseng Extragradient Method
- Inertial Tseng Extragradient Method is a first-order splitting algorithm that integrates momentum-based extrapolation to solve monotone inclusion and equilibrium problems.
- The method combines inertial extrapolation, a forward-backward step, and an extragradient correction, yielding weak, strong, and in some settings linear convergence guarantees.
- It improves practical performance on large-scale, ill-conditioned problems through adaptive parameter selection, typically reducing iteration counts.
The inertial Tseng extragradient method is a class of first-order splitting algorithms for monotone inclusion, variational inequality, or equilibrium problems that incorporates inertial (momentum-type) extrapolation into the classical Tseng forward–backward–forward (extragradient) scheme. These methods, typically implemented in real Hilbert spaces, can accommodate both single-valued and set-valued maximally monotone operators, pseudomonotone and quasimonotone maps, and may be adapted to composite, multi-valued, and structured problems. Inertial Tseng methods are motivated by the desire to accelerate convergence and improve practical performance—especially in large-scale and ill-conditioned problems—while preserving rigorous theoretical guarantees such as weak/strong convergence and nonasymptotic complexity rates. This entry presents the foundational problem setups, algorithmic principles, convergence results, parameter selection, and practical applications, as documented in recent literature.
1. Foundational Problem Structures
Inertial Tseng extragradient methods address structured monotone inclusion problems of the form:

find $x \in H$ such that $0 \in A(x) + B(x)$,

where $A : H \to H$ is single-valued, monotone, and typically $L$-Lipschitz continuous, and $B : H \rightrightarrows H$ is maximally monotone, possibly set-valued. The solution set is denoted $\Omega = (A + B)^{-1}(0)$, assumed nonempty. This generalizes variational inequalities, fixed-point problems, and equilibrium problems, with extensions covering pseudomonotone or quasimonotone operators, nonconvex objectives, and (multi-)valued mappings (Wang et al., 2022, Bot et al., 2014, Tan et al., 2020, Fang et al., 2020, 2611.18642).
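The backward half of the scheme is a resolvent evaluation $J_{\lambda B} = (I + \lambda B)^{-1}$. As a concrete illustration (not drawn from the cited papers), when $B = \partial(\tau \|\cdot\|_1)$ this resolvent reduces to componentwise soft-thresholding, the operation that underlies the LASSO experiments mentioned later:

```python
import numpy as np

def soft_threshold(x, tau):
    """Resolvent of B = d(tau * ||.||_1): J_{tau B}(x) = (I + tau B)^{-1}(x),
    which reduces to componentwise shrinkage toward zero by tau."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)
```

For example, `soft_threshold(np.array([3.0, -0.5, 1.0]), 1.0)` shrinks each entry by 1 in magnitude and zeroes the small ones.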
For variational inequalities, the setting is:

find $x^* \in C$ such that $\langle F(x^*), y - x^* \rangle \ge 0$ for all $y \in C$,

with $C \subseteq H$ closed and convex, and $F : H \to H$ monotone, pseudomonotone, or quasimonotone.
In equilibrium problems, one works with bifunctions $f : C \times C \to \mathbb{R}$ such that $f(x, x) = 0$, seeking $x^* \in C$ with $f(x^*, y) \ge 0$ for all $y \in C$, again under generalized monotonicity and (sub)continuity assumptions (Nwakpa et al., 23 Nov 2025, Fang et al., 2020).
2. Algorithmic Principles and Core Iterative Schemes
A prototypical inertial Tseng method at iteration $n$ extrapolates from previous iterates to gain momentum:
- Inertial Extrapolation (possibly multi-step):
  - $w_n = x_n + \theta_n (x_n - x_{n-1})$ (single-step inertia)
  - $w_n = x_n + \theta_n (x_n - x_{n-1}) + \delta_n (x_{n-1} - x_{n-2})$ (double-step inertia; triple-step variants add a further term) (Wang et al., 2022, Peng et al., 15 Jan 2026)
- Forward-Backward (Tseng) Step:
  - $y_n = (I + \lambda_n B)^{-1}(w_n - \lambda_n A w_n)$
- Tseng Extragradient Correction:
  - $z_n = y_n - \lambda_n (A y_n - A w_n)$
- Relaxation/averaged update and possible correction:
  - $x_{n+1} = (1 - \rho_n) w_n + \rho_n z_n$, possibly combined with a secondary inertial term (Wang et al., 2022)
  - More generally, under-relaxation or averaging with other correction mappings.
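The steps above can be sketched in a few lines. This is a minimal illustrative implementation, assuming $B$ is the normal cone of a box so the resolvent is a Euclidean projection, a fixed step size $\lambda < 1/L$, and a constant single-step inertia $\theta$; the function names and the affine test operator are illustrative, not from the cited papers:

```python
import numpy as np

def inertial_tseng_vi(A, proj, x0, lam, theta, iters=500):
    """One-step inertial Tseng (forward-backward-forward) iteration for a
    variational inequality with single-valued operator A over the set
    implied by `proj`."""
    x_prev = np.asarray(x0, dtype=float)
    x = x_prev.copy()
    for _ in range(iters):
        w = x + theta * (x - x_prev)             # inertial extrapolation
        y = proj(w - lam * A(w))                 # forward-backward (projection) step
        x_prev, x = x, y - lam * (A(y) - A(w))   # Tseng extragradient correction
    return x

# Monotone affine operator A(x) = Mx + q (symmetric part of M is positive definite).
M = np.array([[2.0, 1.0], [-1.0, 2.0]])
q = np.array([-1.0, -1.0])
A = lambda x: M @ x + q
proj_box = lambda x: np.clip(x, -1.0, 1.0)       # projection onto C = [-1, 1]^2

# lam < 1/L with L = ||M||_2 = sqrt(5); theta kept small for stability.
sol = inertial_tseng_vi(A, proj_box, np.zeros(2), lam=0.4, theta=0.2)
```

Here the unconstrained zero of $A$ lies inside the box, so the iterates converge to $M^{-1}(1,1)^\top = (0.2, 0.6)$.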
Parameter selection follows explicit bounds to maintain stability, e.g., inertia parameters $\theta_n \in [0, \theta)$ with $\theta < 1$, step-sizes $\lambda_n$ set via self-adaptive or Armijo-type rules (or fixed with $\lambda L < 1$ when $L$ is known), and relaxation parameters within prescribed intervals (Bot et al., 2014, Wang et al., 2022, Alves et al., 2018, Peng et al., 15 Jan 2026).
In set-valued and equilibrium extensions, the forward-backward-forward architecture persists, but explicit subproblems (proximal or variational substeps) and halfspace projections are employed (Nwakpa et al., 23 Nov 2025, Fang et al., 2020).
3. Convergence Properties and Complexity Rates
The key theoretical advances provided by inertial Tseng methodologies include:
- Weak Convergence: Under standard monotonicity and Lipschitz assumptions, the sequence $\{x_n\}$ (possibly along with auxiliary inertial iterates) converges weakly to a solution, typically established via Lyapunov-type Fejér monotonicity and the Opial lemma (Wang et al., 2022, Bot et al., 2014, Alves et al., 2018, Bot et al., 2014).
- Strong Convergence: Imposing additional strong monotonicity on $A$ or $B$ (or strong pseudomonotonicity in equilibrium/VI settings), one obtains norm convergence to the unique solution (Wang et al., 2022, Tan et al., 2020, Tan et al., 2020, Nwakpa et al., 23 Nov 2025, Fang et al., 2020).
- Linear Convergence: When either $A$ or $B$ is strongly monotone (or the bifunction is strongly pseudomonotone), and the parameters satisfy further contraction conditions, $\{x_n\}$ converges Q-linearly to the solution (Wang et al., 2022, Nwakpa et al., 23 Nov 2025).
- Nonasymptotic Rates: For general monotone problems, a pointwise $O(1/\sqrt{k})$ rate for the operator residual is established; $O(1/k)$ ergodic rates hold for averaged iterates. For strongly monotone cases, $O(1/k)$ expectation rates in stochastic variants, and comparable rates for saddle-point gaps, are proved (Wang et al., 2022, Nguyen et al., 2022, Alves et al., 2018).
- Nonconvex Settings: If the global objective or Lyapunov function satisfies the Kurdyka–Łojasiewicz (KL) property, strong convergence to limiting critical points holds even in absence of convexity (Bot et al., 2014).
- Quasimonotone and Multi-valued VI/EQ: The two-step inertial Tseng method with a self-adaptive, Armijo-like step-size eliminates the need for a global Lipschitz bound on $F$, and achieves weak convergence for (quasi-)monotone VIs, extending applicability (Peng et al., 15 Jan 2026, Fang et al., 2020).
4. Parameter Selection, Step-Size Adaptivity, and Stabilization
Careful selection of the inertia coefficients $\theta_n$ and step-sizes $\lambda_n$ is critical:
| Parameter | Condition/Update | Effect/Role |
|---|---|---|
| $\theta_n$ (inertia) | $\theta_n \in [0, \theta)$; explicit upper bounds | Controls magnitude of inertial extrapolation |
| $\lambda_n$ (step-size) | Self-adaptive: $\lambda_{n+1} = \min\{\lambda_n,\ \mu \|w_n - y_n\| / \|A w_n - A y_n\|\}$, or Armijo/backtracking | Ensures robustness without global Lipschitz constant |
| $\rho_n$ (relaxation) | Within a prescribed interval | Adjusts relaxation/averaging, stability/acceleration |
| $\mu$ | Decay: $\mu \in (0, 1)$ | Stability in self-adaptive rules |
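The self-adaptive rule in the table can be sketched as follows; this is a common min-type update with parameter $\mu \in (0,1)$, and the exact rule varies across the cited papers:

```python
import numpy as np

def update_step_size(lam, mu, w, y, Aw, Ay):
    """Self-adaptive step-size rule: shrink lam only when the local Lipschitz
    estimate ||Aw - Ay|| / ||w - y|| demands it. The sequence of step sizes is
    non-increasing and requires no global Lipschitz constant."""
    denom = np.linalg.norm(Aw - Ay)
    if denom > 0.0:
        return min(lam, mu * np.linalg.norm(w - y) / denom)
    return lam
```

For an operator with local Lipschitz behavior $\|Aw - Ay\| = 2\|w - y\|$, the rule caps the step at $\mu/2$.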
For stochastic and inexact settings, step-sizes and inertia are further regulated by error summability and martingale conditions to maintain almost-sure and expected convergence (Nguyen et al., 2022, Alves et al., 2018).
Double- or multi-step inertia (inclusion of $x_{n-1} - x_{n-2}$ terms) can empirically accelerate convergence at the cost of tighter stability control and parameter tuning (Wang et al., 2022, Peng et al., 15 Jan 2026).
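Schematically, a two-step inertial extrapolation of this kind is (a minimal sketch; the coefficient names follow the earlier notation):

```python
def two_step_inertia(x, x_prev, x_prev2, theta, delta):
    """Two-step inertial extrapolation:
    w_n = x_n + theta*(x_n - x_{n-1}) + delta*(x_{n-1} - x_{n-2})."""
    return x + theta * (x - x_prev) + delta * (x_prev - x_prev2)
```

Setting `delta = 0` recovers the single-step inertial extrapolation.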
5. Structural and Practical Algorithmic Enhancements
Inertial Tseng-type schemes can be adapted and generalized in multiple directions:
- One-projection Schemes: Several recent algorithms reduce projection complexity by requiring only a single projection per iteration (as opposed to two in classical Korpelevich schemes), lowering per-iteration cost (Tan et al., 2020, Tan et al., 2020, Fang et al., 2020).
- Step-Size Independence from the Global Lipschitz Constant $L$: Adaptive, local, or backtracking step-size rules circumvent the need for a known global Lipschitz constant, making these methods applicable to problems where such constants are elusive or conservative (Peng et al., 15 Jan 2026, Tan et al., 2020, Tan et al., 2020, Wang et al., 2022).
- Non-Euclidean Geometry: Replacement of Euclidean projections by Bregman projections for non-Euclidean or mirror-descent frameworks (Bot et al., 2014, Fang et al., 2020).
- Stochastic and Inexact Variants: Accommodate oracle noise, inexact operator evaluations, and relative error via stochastic approximation or hybrid proximal-extragradient approaches (Nguyen et al., 2022, Alves et al., 2018).
- Equilibrium and Multi-Valued Extensions: Pseudomonotone, quasimonotone, and multi-valued mappings, as in equilibrium theory and generalized (sub)gradient settings, are handled via suitable proximal or correction subroutines (Fang et al., 2020, Nwakpa et al., 23 Nov 2025).
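An Armijo-type backtracking rule of the sort referenced above can be sketched as follows, assuming the standard acceptance condition $\lambda \|A w - A y\| \le \mu \|w - y\|$ with $y = \mathrm{proj}_C(w - \lambda A w)$; all parameter values are illustrative:

```python
import numpy as np

def armijo_step(A, proj, w, lam0=1.0, mu=0.5, beta=0.5, max_back=50):
    """Backtrack lam until lam * ||A(w) - A(y)|| <= mu * ||w - y||,
    where y = proj(w - lam * A(w)). No global Lipschitz constant is needed;
    the accepted lam and the corresponding trial point y are returned."""
    Aw = A(w)
    lam = lam0
    for _ in range(max_back):
        y = proj(w - lam * Aw)
        if lam * np.linalg.norm(Aw - A(y)) <= mu * np.linalg.norm(w - y):
            return lam, y
        lam *= beta  # shrink and retry
    return lam, proj(w - lam * Aw)
```

For the scalar operator $A(x) = 3x$ with no constraint, backtracking from $\lambda_0 = 1$ halves the step until $\lambda \le \mu/3$, accepting $\lambda = 0.125$.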
6. Numerical Performance and Practical Impact
Extensive computational experiments have been reported:
- Compressive Sensing (LASSO): In compressive sampling, the double-inertial Tseng method achieves lower iteration counts and runtimes than relaxed/inertial Tseng variants and alternative acceleration schemes (Wang et al., 2022).
- Large-Scale Variational Inequalities: With random monotone operators, inertial Tseng-type methods outperform classical non-inertial extragradient algorithms in both high- and infinite-dimensional settings (Wang et al., 2022, Tan et al., 2020, Tan et al., 2020, Fang et al., 2020).
- Pseudomonotone/Quasimonotone Operators: Fast and robust performance in generalized monotonicity settings, where classical algorithms may stagnate or fail (Tan et al., 2020, Fang et al., 2020, Peng et al., 15 Jan 2026, Nwakpa et al., 23 Nov 2025).
- Optimal Control: Application to discretized control problems with box/range constraints, recovering highly nontrivial optimal profiles rapidly (Tan et al., 2020).
Empirical results consistently indicate that the inclusion of inertia, especially in double/multi-step versions with adaptive step-size control, reduces iteration counts and wall-clock time and improves robustness.
7. Theoretical Significance and Methodological Positioning
The inertial Tseng extragradient paradigm unifies and generalizes several important operator splitting methods, including forward-backward, hybrid proximal-extragradient, and Korpelevich extragradient methods. The fundamental theoretical strengths include:
- Global Weak/Strong/Linear Rates: Rigorous convergence under minimal assumptions, competitive with the best-in-class splitting algorithms (Wang et al., 2022, Alves et al., 2018, Nwakpa et al., 23 Nov 2025).
- Parameter Independence from Problem Constants: Step-size rules that avoid global operator norm estimation enhance practical implementation viability.
- Extendability: The architecture accommodates stochastic/inexact oracles, multi-valued operators, and can be adapted for both convex and nonconvex models.
- Versatility: Empirically and theoretically validated across disparate domains (signal processing, optimization, PDE-constrained problems, control).
Contemporary developments focus on further acceleration by multi-step inertial terms, increased adaptivity, extension to non-Euclidean geometry and Banach spaces, and removal of restrictive monotonicity/Lipschitz hypotheses (Peng et al., 15 Jan 2026, Bot et al., 2014, Nguyen et al., 2022). The method remains an active area for both theoretical investigation and algorithmic development in large-scale, structured, and non-smooth optimization.