
Instance-Wise Minimax Reformulations

Updated 9 December 2025
  • Instance-wise minimax reformulations are robust optimization frameworks that tailor performance guarantees to individual data instances while ensuring worst-case minimax bounds.
  • They utilize LP-based strategies, robust estimation methods, and blockwise testing to bridge the gap between instance-optimal and adversarial regimes.
  • Applications include bandit algorithms, robust AUC optimization, and heavy-tailed regression, offering practical convergence and error guarantees in complex environments.

Instance-wise minimax reformulations refer to optimization frameworks and algorithms in statistical decision theory, empirical risk minimization, bandit learning, and robust statistics in which both the performance guarantees and the objective constructions are explicitly tailored to the individual instance at hand (often the data distribution, environment, or parameter), while simultaneously achieving worst-case minimax optimality in adversarial or hardest cases. This paradigm enables algorithms and estimators to interpolate gracefully between instance-optimal and minimax regimes, balancing adaptivity to benign environments with robustness to adversarial conditions. It is foundational in modern approaches to bandit algorithms, robust AUC optimization, heavy-tailed regression, and general minimax convex-linear learning problems.

1. Formulation and Definitions

Instance-wise minimax reformulation typically refers to the design of an optimization problem and corresponding algorithm whose regret or estimation error (a) nearly matches an information-theoretic lower bound for the specific “instance” (e.g., a given $\theta^*$, arm set $X$, or loss structure), and (b) remains minimax-optimal in the worst-case environment. The essential problem setting is to minimize or estimate under uncertainty about data, reward, or adversary, with key benchmarks:

  • Instance-optimality: Achieve regret $o(T^\epsilon)$ for all $\epsilon>0$ in stochastic or corrupted cases, with scaling determined by an explicit instance-dependent constant $c(X,\theta)$ or similar.
  • Minimax-optimality: Achieve the worst-case lower bounds (e.g., $O(d\sqrt{T})$ in adversarial bandits, $O(\epsilon^{-3})$ convergence in min-max optimization).
  • Reformulation: Express the learning or estimation task as a min-max (or saddle-point) problem whose structure reduces to an instance-wise LP, convex-linear, or strongly-concave minimax problem with explicit constraints and estimators.

For example, in linear bandits (Lee et al., 2021), for arms $X\subset\mathbb{R}^d$ and unknown loss vector $\theta$, the regret is measured against the unique best arm $x^*$, with instance-dependent gap $\Delta_x$, and the minimax parameter $c(X,\theta)$ defined by:

$$
c(X,\theta) = \inf_{N_x \ge 0} \sum_{x \neq x^*} N_x \Delta_x \quad \text{s.t.} \quad \|x\|_{(\sum_{z} N_z z z^\top)^{-1}}^2 \le \frac{\Delta_x^2}{2} \quad \forall\, x \neq x^*
$$

Algorithms aim to match this lower bound for the individual instance, while retaining $O(d\sqrt{T})$ regret in adversarial cases.
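
As an illustration, the program defining $c(X,\theta)$ can be evaluated numerically for a candidate allocation $N$. The sketch below (function name and two-arm data are ours, and it only checks the confidence-width constraint and computes the allocation cost, rather than solving the full convex program):

```python
import numpy as np

def allocation_cost_and_feasible(X, theta, N, x_star_idx):
    """Evaluate the instance-dependent allocation program behind c(X, theta).

    X: (K, d) array of arms; theta: (d,) loss vector; N: (K,) candidate pull
    counts. Returns the objective sum_{x != x*} N_x * Delta_x and whether the
    width constraint ||x||^2_{V(N)^{-1}} <= Delta_x^2 / 2 holds for every
    suboptimal arm, where V(N) = sum_z N_z z z^T is the design matrix.
    """
    losses = X @ theta
    gaps = losses - losses[x_star_idx]            # Delta_x >= 0 for losses
    V = (N[:, None, None] * np.einsum('ki,kj->kij', X, X)).sum(axis=0)
    V_inv = np.linalg.inv(V)
    feasible = True
    for k in range(len(X)):
        if k == x_star_idx:
            continue
        width = X[k] @ V_inv @ X[k]               # ||x||^2 in the V(N)^{-1} norm
        if width > gaps[k] ** 2 / 2 + 1e-9:
            feasible = False
    cost = sum(N[k] * gaps[k] for k in range(len(X)) if k != x_star_idx)
    return cost, feasible
```

On the orthonormal two-arm instance with losses $(0, 1)$, the suboptimal arm must be pulled at least twice before its width constraint $1/N_1 \le 1/2$ holds, so $N = (1, 2)$ is feasible at cost $2$ while $N = (1, 1)$ is not.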

2. Algorithmic Strategies and Robust Estimation

Instance-wise minimax reformulation involves specialized estimation and testing mechanisms. Key steps include:

  • Distributional LPs: Replace deterministic sample allocations by randomized distributions over actions (arms, examples), matching the lower-bound LP solution via a distribution $p^*$ (Lee et al., 2021). This controls sample variance and delivers unbiased estimators.
  • Robust estimators: Use Catoni-style robust means, adaptive Huber regression, or other M-estimators to handle heavy-tailed or corrupted rewards, ensuring high-probability confidence intervals even under limited moment assumptions (Huang et al., 2023).
  • Block-wise testing and phase switching: Structure learning into epochs, monitor deviations, employ statistical testing to transition between adversarial and instance-optimal regimes, and adapt block lengths as dictated by observed non-stochasticity.
  • Minimax optimization: Reformulate objectives as regularized or penalized nonconvex × strongly-concave minimax problems; e.g., with variables $(w,a,b,\gamma)$ for AUC optimization, or $(x,y)$ plus dual multipliers $\lambda$ for coupled constraints (Hu et al., 2024).
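
The robust-estimation bullet can be made concrete. Below is a minimal Catoni-style mean estimator (a sketch: the cited papers tune the scale parameter $\alpha$ carefully from variance and confidence-level considerations, whereas here it is fixed arbitrarily):

```python
import numpy as np

def catoni_mean(x, alpha=0.5, tol=1e-9):
    """Catoni-style robust mean: the root mu of sum_i psi(alpha*(x_i - mu)) = 0,
    using the influence function psi(t) = sign(t) * log(1 + |t| + t^2/2).
    The sum is strictly decreasing in mu, so the root is unique; we find it
    by bisection over [min(x), max(x)].
    """
    x = np.asarray(x, dtype=float)
    psi = lambda t: np.sign(t) * np.log1p(np.abs(t) + 0.5 * t * t)
    lo, hi = x.min(), x.max()
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if psi(alpha * (x - mid)).sum() > 0:      # mu too small: move right
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Because $\psi$ grows only logarithmically, a single large outlier shifts the estimate far less than it shifts the empirical mean, which is what yields high-probability confidence intervals under limited moment assumptions.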

Block structure and variance reduction (e.g., momentum averaging, semi-bandit combinatorics, variance-reduced gradients) are recurrent themes for scalable implementation and high-probability convergence.
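
The minimax-optimization step can be illustrated with plain two-timescale gradient descent ascent on a toy strongly-convex-strongly-concave saddle problem (a simplification: ASGDA-style methods add stochastic gradients, momentum, and variance reduction that this sketch omits):

```python
import numpy as np

def gda(grad_x, grad_y, x0, y0, eta_x=0.05, eta_y=0.5, steps=2000):
    """Two-timescale gradient descent ascent for min_x max_y f(x, y):
    a small descent step in x and a larger ascent step in y, so the inner
    strongly concave maximization is tracked accurately.
    """
    x, y = np.array(x0, float), np.array(y0, float)
    for _ in range(steps):
        x = x - eta_x * grad_x(x, y)
        y = y + eta_y * grad_y(x, y)
    return x, y

# Toy instance (for illustration only):
# f(x, y) = 0.5||x||^2 + <x, y> - 0.5||y||^2, saddle point at (0, 0).
gx = lambda x, y: x + y          # gradient of f in x
gy = lambda x, y: x - y          # gradient of f in y
```

Running `gda(gx, gy, x0, y0)` from any starting point drives both iterates to the saddle point at the origin; in the nonconvex × strongly-concave settings discussed above, the same scheme converges to stationary points rather than global saddles.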

3. Key Regret and Error Guarantees

Instance-wise minimax reformulation yields precise performance bounds:

| Setting | Instance-Wise Bound | Minimax Bound | Reference |
|---|---|---|---|
| Stochastic/corrupted linear bandits | $O(c(X,\theta)\ln^2 T + C)$ | $O(d\sqrt{T})$ | (Lee et al., 2021) |
| Logistic bandits (OFULog) | $O(d\sqrt{T/\kappa^*}) + O(d^3)$ | $\tilde{O}(d\sqrt{T})$ | (Abeille et al., 2020) |
| Heavy-tailed linear/RL | $\tilde{O}\big(d\,T^{\frac{1-\epsilon}{2(1+\epsilon)}}\sqrt{\sum_t \nu_t^2}\big)$ | $\Omega(d\,T^{1/(1+\epsilon)})$ | (Huang et al., 2023) |
| Empirical minimax learning | $\Delta(\bar{w},\bar{a}) \lesssim O(\sqrt{kn/T})$ | — | (Roux et al., 2021) |
| Minimax PAUC opt. (ASGDA) | $O(n\epsilon^{-3})$ iterations; $\tilde{O}(\alpha^{-1}n_+^{-1}+\beta^{-1}n_-^{-1})$ gen. error | — | (Shao et al., 2022; Jiang et al., 1 Dec 2025) |

These bounds demonstrate simultaneous adaptivity: for benign or deterministic cases, instance-optimal rates dominate; for hardest cases (adversarial, heavy-tailed), worst-case minimax rates are matched.

4. Representative Problem Domains and Extensions

Instance-wise minimax reformulation has been developed in:

  • Bandits: Linear, logistic, and dueling bandits leverage instance-dependent LPs, robust (Catoni) means, and blockwise testers (Lee et al., 2021, Abeille et al., 2020, Lee et al., 3 Jun 2025).
  • Empirical minimax learning: Distributionally robust objectives (max-loss, average top-$k$ loss, CVaR) with efficient online bandit/online learner decomposition and sparsity/injectivity constraints for the simplex (Roux et al., 2021).
  • Partial AUC optimization: Direct instance-wise minimax rewriting of OPAUC/TPAUC as nonconvex × strongly-concave problems, sorting-free surrogates, and smooth/unbiased variants (Shao et al., 2022, Jiang et al., 1 Dec 2025).
  • Heavy-tailed statistics: Adaptive Huber regression and self-normalized concentration to handle heavy-tailed noise and yield instance-dependent rates in RL and regression (Huang et al., 2023).
  • Nonconvex-concave minimax optimization: General reduction to unconstrained or penalized minimization (MMPen), PFBE envelopes, and Clarke-stationary equivalence (Hu et al., 2024).
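
The distributionally robust objectives above admit compact instance-wise reformulations. For example, the average top-$k$ loss (CVaR at level $k/n$) equals a minimization over a scalar threshold, $\min_\lambda\,\{\lambda + \frac{1}{k}\sum_i (\ell_i-\lambda)_+\}$, avoiding an explicit maximization over the simplex. A minimal numpy sketch (function name ours):

```python
import numpy as np

def avg_top_k(losses, k):
    """Average of the k largest losses via its variational (threshold) form.

    The objective lambda + (1/k) * sum_i max(l_i - lambda, 0) is convex and
    piecewise linear in lambda, with its minimum attained at one of the l_i
    (the k-th largest), so a scan over the data points suffices. In learning,
    lambda is instead kept as an extra optimization variable so the objective
    remains a smooth(able) minimax problem.
    """
    losses = np.asarray(losses, float)
    obj = lambda lam: lam + np.maximum(losses - lam, 0.0).sum() / k
    return min(obj(lam) for lam in losses)
```

For `losses = [1, 5, 3, 2]` and `k = 2`, the value agrees with the mean of the two largest losses, $(5 + 3)/2 = 4$; at $k = n$ it reduces to the empirical mean, and at $k = 1$ to the max-loss.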

Potential extensions include structured combinatorial constraints, broader function classes, and applications to high-dimensional semibandit settings.

5. Theoretical Foundations and Proof Techniques

Instance-wise minimax guarantees are grounded in convex analysis, duality, self-concordance, robust concentration inequalities, and information-theoretic packing. Typical arguments:

  • Derive instance-dependent LP or saddle-point reformulation matching lower bounds via distributional sampling.
  • Prove robust estimation via Catoni/Huber methods, ensuring deviations scale with per-instance moments or design covariances.
  • Partition error into “permanent” (instance-optimal) and “transitory” (worst-case) terms, and demonstrate necessity via matching lower bounds (e.g., packing or KL-divergence arguments).
  • Establish equivalence of minimax/Coupled Constraint problem stationary points to unconstrained or penalized minimization via envelope constructions and Clarke subdifferential analysis (Hu et al., 2024).
  • Optimize complexity via variance reduction, blockwise scheduling, and sparsity/injectivity in combinatorial bandit actions.
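
The “permanent vs. transitory” partition in the third bullet can be written schematically for the bandit setting, consistent with the bounds tabulated in Section 3 (a sketch of the shape of such results, not any single paper's exact statement):

```latex
% Best-of-both-worlds regret split: an instance-optimal stochastic phase
% plus an adversarial fallback triggered by the blockwise tests, where
% T_adv denotes the time spent in the adversarial regime.
R_T \;\le\;
\underbrace{O\!\big(c(X,\theta)\,\ln^2 T + C\big)}_{\text{permanent (instance-optimal) term}}
\;+\;
\underbrace{O\!\big(d\sqrt{T_{\mathrm{adv}}}\big)}_{\text{transitory (worst-case) term}},
\qquad T_{\mathrm{adv}} \le T .
```

Matching lower bounds (via packing or KL-divergence arguments) then show that neither term can be removed without sacrificing either adaptivity or robustness.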

These techniques offer sharp interpolation between instances and worst-case environments.

6. Computational and Generalization Properties

Algorithms arising from instance-wise minimax reformulations typically offer:

  • Linear per-iteration complexity: Sorting-free surrogates and instance-wise variable reduction yield $O(n)$ or $O(n_+^B + n_-^B)$ batch costs (Jiang et al., 1 Dec 2025).
  • Accelerated convergence rates: ASGDA, blockwise momentum, and penalized minimization yield $O(\epsilon^{-3})$ or $O(1/\epsilon^2)$ rates for typical minimax problems under strong concavity (Shao et al., 2022; Jiang et al., 1 Dec 2025; Hu et al., 2024).
  • Generalization bounds: Instance-wise minimax losses enable direct application of local Rademacher complexity and avoid cumbersome pairwise covering-number analyses, with error scaling as $\tilde{O}(\alpha^{-1}n_+^{-1}+\beta^{-1}n_-^{-1})$ for PAUC (Jiang et al., 1 Dec 2025).
  • Empirical performance: Demonstrated gains over competing methods in CIFAR, Tiny-ImageNet, and synthetic resource-allocation problems, with notably improved speed and stationarity versus standard GDA solvers.

A plausible implication is that continued algorithmic refinement—especially for nonconvex minimax setups—may further improve high-probability guarantees and scalability for structured domains.

7. Connections, Limitations, and Future Outlook

Instance-wise minimax reformulation synthesizes robust statistical estimation, online/sequential decision theory, and adversarial machine learning. The framework depends critically on the feasibility and solvability of instance-dependent LPs or convex surrogates, robust estimator constructions, and tractable combinatorics of extremal sets (e.g., sparsity and injectivity).

Current limitations include dependence on explicit structure for the instance simplex/extremal set, requirement of strong-concavity/smoothness in inner problems, and scaling of certain variance-control quantities in high dimensions. Extending this paradigm to more general coupled constraints, broader function approximation classes, and high-dimensional adversarial scenarios remains an area of active research. The fundamental insight is the reconciliation of adaptivity (“best-of-both-worlds”) with worst-case robustness in statistical learning and optimization.
