
Minimax-Optimal Halpern Scheme

Updated 29 January 2026
  • The scheme establishes fixed-point convergence through a classical Halpern anchoring method that attains an O(1/N^2) worst-case rate.
  • It employs algebraic invariant theory (H-invariance) to characterize algorithms that achieve minimax optimality for nonexpansive and Lipschitz operators.
  • Adaptive variants and splitting schemes extend its applicability to monotone inclusions, convex-concave minimax problems, and saddle-point optimization.

The minimax-optimal Halpern scheme refers to a class of fixed-point iterative algorithms for nonexpansive or Lipschitz operators that provably attain the optimal worst-case convergence rate for driving the fixed-point residual to zero. This paradigm generalizes Halpern’s classical anchoring approach, and exact characterizations exist both for nonexpansive maps in Hilbert/normed spaces and for operators relevant to monotone inclusions, variational inequalities, and convex-concave minimax optimization. The minimax-optimality is established via tight non-asymptotic bounds, algebraic invariant theory (H-invariance), and lower bounds showing that no first-order scheme can improve upon the obtained rates under black-box oracle models.

1. Halpern Iteration Fundamentals and Minimax Rate

Halpern’s iteration for finding $x^* = T(x^*)$ with $T$ nonexpansive ($\|Tx - Ty\| \le \|x - y\|$) in a normed space is anchored to an initial point $x^0$ and uses a sequence of weights $(\alpha_k)$:

$$x^{k+1} = \alpha_{k+1}\, x^0 + (1 - \alpha_{k+1})\, T x^k$$

The optimal choice for minimax convergence in the nonexpansive ($\rho = 1$) case is $\alpha_{k+1} = \tfrac{1}{k+2}$, yielding

$$\|x^N - T x^N\|^2 \le \frac{4\,\|x^0 - x^*\|^2}{N^2}$$

and no deterministic first-order method can improve on the $O(1/N^2)$ bound in this setting (Yoon et al., 18 Nov 2025).
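
A concrete illustration (a minimal sketch, not code from the cited papers): running the anchored update with $\alpha_{k+1} = 1/(k+2)$ on a toy nonexpansive map, here a plane rotation whose unique fixed point is the origin, and checking the residual bound above.

```python
import numpy as np

def halpern(T, x0, N):
    """Anchored iteration x^{k+1} = a_{k+1} x^0 + (1 - a_{k+1}) T(x^k), a_{k+1} = 1/(k+2)."""
    x = x0.copy()
    for k in range(N):
        a = 1.0 / (k + 2)
        x = a * x0 + (1.0 - a) * T(x)
    return x

# Toy nonexpansive operator: rotation by 90 degrees (an isometry; fixed point x* = 0).
R = np.array([[0.0, -1.0], [1.0, 0.0]])
T = lambda x: R @ x

x0 = np.array([1.0, 0.0])
N = 200
xN = halpern(T, x0, N)

# Minimax bound: ||x^N - T x^N||^2 <= 4 ||x^0 - x*||^2 / N^2, with x* = 0 here.
res_sq = np.linalg.norm(xN - T(xN)) ** 2
bound = 4.0 * np.linalg.norm(x0) ** 2 / N**2
print(res_sq <= bound)  # → True
```

The rotation is close to the known worst-case geometry for anchored methods, so the observed residual sits near the theoretical bound rather than far below it.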

In the general Lipschitz regime ($\|Tx - Ty\| \le \rho \|x - y\|$, $\rho > 0$), the minimax-optimal weight sequence $(\alpha_k)$ results from recursively minimizing a tight quadratic upper bound on the fixed-point residual, which yields an explicit recursion for both the optimal weights and the attained residual bound (Bravo et al., 22 Jan 2026). The recursion is tight: for each $N$, there exists a $\rho$-Lipschitz $T$ and an initialization for which equality holds.

2. H-Invariance Theory and Complete Characterization

The exhaustive algebraic theory of minimax acceleration for nonexpansive fixed-point algorithms is provided via H-invariance (Yoon et al., 18 Nov 2025). Any algorithm admitting the lower-triangular "moment-mixing" form

$$x^{k+1} = x^k + \sum_{i=0}^{k} H_{k+1,\,i+1}\,\big(T x^i - x^i\big)$$

is described by its H-matrix $H = (H_{k,i})$. The family of minimax-optimal algorithms is precisely those whose H-matrices satisfy the prescribed H-invariants together with nonnegative H-certificates (the unique solution to a linear–quadratic identity). Only methods with these invariants and certificates attain the guaranteed $O(1/N^2)$ rate on the squared residual. The classical Halpern method (OHM) and its H-dual are the extremal cases.
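
The moment-mixing form implies, in particular, that every iterate deviates from the anchor $x^0$ only along the past residual directions $T x^i - x^i$. A small numerical check of this span property for the classical Halpern weights (an illustrative sketch, not the papers' construction):

```python
import numpy as np

rng = np.random.default_rng(0)
d, K = 30, 8  # ambient dimension well above the number of steps

# Random orthogonal matrix: an isometry, hence a nonexpansive operator.
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
T = lambda x: Q @ x

x0 = rng.standard_normal(d)
x, res = x0, []
for k in range(K):
    res.append(T(x) - x)               # residual direction T x^k - x^k
    a = 1.0 / (k + 2)
    x = a * x0 + (1.0 - a) * T(x)

# x^K - x^0 should lie in span{T x^0 - x^0, ..., T x^{K-1} - x^{K-1}}.
A = np.column_stack(res)               # d x K matrix of residual directions
coeffs, *_ = np.linalg.lstsq(A, x - x0, rcond=None)
err = np.linalg.norm(A @ coeffs - (x - x0))
print(err)  # numerically zero
```

Because $d > K$, the least-squares fit is a genuine test: a vector outside the residual span would leave a large misfit.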

3. Behavior under Contractive and Expansive Operators

For $\rho < 1$ (contractions), the minimax-optimal Halpern sequence transitions from a sublinear Halpern phase to geometric Banach–Picard iteration: the anchoring weight decreases until it reaches $0$ and stays there, from a finite switching index that depends on $\rho$. Beyond this index the residual decays geometrically, at rate $\rho^k$. As $\rho \to 1^{-}$, the length of the Halpern phase diverges, recovering the $O(1/N^2)$ rate for nonexpansive maps.

For $\rho > 1$ (expansive operators), the optimal weight sequence never reaches the Picard value $0$, and the residual converges to a positive limit matching the minimal displacement of $T$ on bounded domains. The minimax scheme is purely Halpern throughout.
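
The contraction-case phase transition can be illustrated on a toy $\rho$-Lipschitz map. The switching index below is hand-picked for illustration (the optimal index from the analysis is not reproduced here); after the switch, each plain Picard step contracts the residual by exactly the factor $\rho$:

```python
import numpy as np

rho = 0.5                          # contraction factor
R = np.array([[0.0, -1.0], [1.0, 0.0]])
T = lambda x: rho * (R @ x)        # rho-Lipschitz linear map, fixed point 0

x0 = np.array([1.0, 0.0])
x = x0.copy()
K_switch = 5                       # illustrative hand-picked switch index
residuals = []
for k in range(40):
    if k < K_switch:               # Halpern phase: anchored averaging
        a = 1.0 / (k + 2)
        x = a * x0 + (1.0 - a) * T(x)
    else:                          # Banach-Picard phase: plain iteration
        x = T(x)
    residuals.append(np.linalg.norm(x - T(x)))

print(residuals[-1] / residuals[-2])  # → 0.5 (geometric decay at rate rho)
```

For this linear map the residual vector is itself mapped by $\rho R$ in the Picard phase, so the per-step contraction factor is exactly $\rho$.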

4. Potential-Based Analysis and Lower Bounds

The minimax-optimal rates are validated through tight potential-based analysis (Diakonikolas, 2020). For monotone inclusions $0 \in F(x)$ with $F$ cocoercive, running the anchored scheme with weight $\alpha_{k+1} = \tfrac{1}{k+2}$ and backtracking on the (unknown) cocoercivity constant yields

$$\|F(x^k)\| = O\!\left(\frac{\|x^0 - x^*\|}{k}\right)$$

The "potential" function is quadratic-minus-linear in the operator evaluations and telescopes over the run.
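
For intuition (a sketch under stated assumptions, not the paper's backtracking scheme): if $F = \nabla f$ with $f$ convex and $L$-smooth, then $F$ is $1/L$-cocoercive and $T = I - F/L$ is nonexpansive, so the anchored iteration applies directly and the $O(1/k)$ decay of $\|F(x^k)\|$ can be observed with a fixed step in place of backtracking:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 10
M = rng.standard_normal((d, d))
A = M.T @ M                              # positive definite: F(x) = A x = grad of (1/2) x^T A x
L = np.linalg.eigvalsh(A)[-1]            # smoothness / Lipschitz constant of F
F = lambda x: A @ x
T = lambda x: x - F(x) / L               # nonexpansive since F is (1/L)-cocoercive

x0 = rng.standard_normal(d)
x = x0.copy()
N = 500
for k in range(N):
    a = 1.0 / (k + 2)
    x = a * x0 + (1.0 - a) * T(x)

# ||F(x^N)|| = L ||x^N - T x^N|| <= 2 L ||x^0 - x*|| / N, with x* = 0 here.
print(np.linalg.norm(F(x)) <= 2 * L * np.linalg.norm(x0) / N)  # → True
```

The identity $\|F(x)\| = L\,\|x - Tx\|$ converts the residual bound for $T$ directly into the operator-norm bound stated above.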

Lower bounds are established via reductions: for any first-order method, there exist problem instances for which the $O(1/k)$ rate on the operator norm is unimprovable (e.g., convex–concave saddle-point problems in sufficiently high dimension) (Diakonikolas, 2020). The Halpern scheme matches these bounds up to polylogarithmic factors and is parameter-free.

5. Splitting Schemes and Extensions

Halpern-type minimax algorithms have been extended to monotone splitting and saddle-point settings (Tran-Dinh et al., 2021). For the inclusion $0 \in A(x) + B(x)$, with $A$ maximally monotone and $B$ monotone and Lipschitz, Halpern–Popov variants give

$$\|G_\lambda(x^k)\| = O\!\left(\frac{\|x^0 - x^*\|}{k}\right)$$

where $G_\lambda$ denotes the forward–backward residual operator. Two splitting schemes are constructed:

  • Extra-Anchored Gradient-Splitting (EAG-Split): two resolvent calls per iteration.
  • Past-Extra-Anchored Gradient-Splitting (PEAG-Split): reuses the past operator evaluation, so each iteration needs only one new computation of the forward operator and one resolvent call.

Application to convex–concave minimax problems $\min_x \max_y \mathcal{L}(x, y)$ leverages the Halpern scheme for the skew-gradient optimality operator, establishing $O(1/k)$ convergence of the gradient norm, matching known lower bounds.
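
As a sketch of the anchored approach on a saddle problem (an EAG-style two-call update with step size $1/(8L)$; an illustration, not the exact method of the cited works): for the bilinear problem $\min_x \max_y x y$, the optimality operator $F(x, y) = (y, -x)$ is skew and $1$-Lipschitz, and the anchored extragradient update drives $\|F\|$ to zero:

```python
import numpy as np

def F(z):
    """Skew optimality operator of min_x max_y x*y: F(x, y) = (y, -x); L = 1."""
    return np.array([z[1], -z[0]])

eta = 1.0 / 8.0                 # step size 1/(8L)
z0 = np.array([1.0, 1.0])
z = z0.copy()
N = 400
for k in range(N):
    b = 1.0 / (k + 2)           # anchoring weight
    z_half = z + b * (z0 - z) - eta * F(z)     # extrapolation step (first operator call)
    z = z + b * (z0 - z) - eta * F(z_half)     # anchored update (second operator call)

print(np.linalg.norm(F(z)))     # small: O(1/N) decay of the gradient norm
```

Without the anchoring term, plain gradient descent-ascent on this skew problem diverges; the anchor is what produces last-iterate convergence here.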

6. Adaptive Variants and Practical Refinements

Adaptive Halpern schemes, inspired by the minimax theoretical bounds, track the empirical residuals $\|x^k - T x^k\|$ and use them to choose anchoring weights $\alpha_{k+1}$ that can outperform the non-adaptive minimax bound in practical settings (Bravo et al., 22 Jan 2026). At each step the weight minimizes the current data-driven upper bound on the residual, so the resulting guarantee is always at least as good as the non-adaptive minimax rate.

Extensions further cover unbounded domains (via displacement bounds based on the distance to the fixed point), affine maps (closed-form expressions for the optimal weights), and strongly monotone operators (via restarted Halpern strategies).

7. Comparison to Alternative Acceleration and Practical Implications

Nesterov’s acceleration for monotone inclusions yields weaker last-iterate rates unless extra structure (e.g., cocoercivity) is present. Anchored extragradient schemes match the Halpern $O(1/N^2)$ rate but require twice as many operator calls per iteration. The minimax-optimal Halpern schemes uniquely combine single-call oracle usage, an explicit algebraic invariant theory, and optimality under black-box oracle models with only monotonicity/Lipschitz assumptions (Tran-Dinh et al., 2021; Yoon et al., 18 Nov 2025; Bravo et al., 22 Jan 2026).

Halpern-type minimax schemes unify approaches for splitting, inclusion, variational inequality, and convex–concave min–max, always matching fundamental lower bounds up to log-factors. The certificate-based H-invariance characterization catalogues every possible optimal algorithm in this class. These results constitute the canonical foundation for minimax-optimal first-order fixed-point algorithms.
