
Minimax Lower Bounds: Theory & Applications

Updated 30 January 2026
  • Minimax lower bounds are rigorous benchmarks that quantify the worst-case risk in estimation problems using tools like Fano's and Le Cam's inequalities.
  • They are applied to assess optimality in various settings including parametric, nonparametric, high-dimensional, and distributed estimation frameworks.
  • Modern methodologies extend these bounds to irregular and non-smooth models by incorporating divergence measures and generalized loss functions.

A minimax lower bound characterizes the fundamental difficulty of a statistical decision or estimation problem by quantifying the smallest possible worst-case risk over a class of procedures and underlying models. Rigorous minimax lower bounds establish the best achievable rate (and sometimes the sharp constant) for any estimator or algorithm, over prescribed parameter spaces and loss functions, and are a cornerstone of information-theoretic and statistical complexity results. They serve as benchmarks for assessing and designing optimal methods in parametric, nonparametric, high-dimensional, and interactive settings, under a wide range of constraints and loss regimes.

1. General Framework and Foundational Results

The minimax risk for a statistical problem with parameter space $\Theta$, data-generating measures $\{P_\theta : \theta \in \Theta\}$, estimator $\widehat\theta$, and loss $L(\theta, \widehat\theta)$ is

$$R^* = \inf_{\widehat\theta} \sup_{\theta \in \Theta} \mathbb{E}_{\theta}\left[ L(\theta, \widehat\theta) \right].$$

Classical lower bounds, such as Fano's, Le Cam's, and Assouad's, reduce minimax estimation to hypothesis testing or packing arguments, using tools including $f$-divergences (KL, $\chi^2$, TV), metric entropy, and information-theoretic covering or packing numbers (Guntuboyina, 2010; Chen et al., 2014). For instance, Guntuboyina's $f$-divergence framework extends and unifies many well-known results:

  • Fano’s inequality provides a lower bound via mutual information (KL-divergence) and entropy covering.
  • Le Cam's two-point and multiple-point methods relate estimation risk to testing hard pairs, or finite packings, of alternatives.
  • Pinsker's and Assouad's lemmas invoke total variation or Hamming-based constructions for functional or combinatorial parameter spaces.

A canonical recipe involves constructing a finite parameter subset with minimal separation $\eta$ under the loss, bounding the Bayes risk using covering numbers and informativeness measures (e.g., $\chi^2$ or KL), and translating this into a minimax lower bound proportional to $\ell(\eta/2)$, where $\ell$ is the marginal loss (Guntuboyina, 2010).
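As a minimal, hedged illustration of the two-point version of this recipe (assuming a Gaussian location model with $n$ i.i.d. observations and squared-error loss; the function and parameter choices below are illustrative, not a construction from the cited papers), the bound can be evaluated numerically:

```python
import numpy as np

def le_cam_two_point_bound(n, sigma, delta):
    """Le Cam two-point lower bound for estimating a Gaussian mean.

    Hypotheses: theta_0 = 0 and theta_1 = delta, data X_1..X_n ~ N(theta, sigma^2).
    Using Pinsker's inequality TV <= sqrt(KL/2), the squared-error minimax risk
    over {theta_0, theta_1} is at least (delta/2)^2 * (1 - TV) / 2.
    """
    kl = n * delta**2 / (2 * sigma**2)      # KL(P_1^n || P_0^n) for product Gaussians
    tv = min(1.0, np.sqrt(kl / 2))          # Pinsker bound on total variation
    return (delta / 2) ** 2 * (1 - tv) / 2

# Optimizing the separation delta ~ sigma / sqrt(n) recovers the parametric rate sigma^2 / n.
n, sigma = 1000, 1.0
deltas = np.linspace(1e-3, 0.2, 400)
best = max(le_cam_two_point_bound(n, sigma, d) for d in deltas)
print(best, sigma**2 / n)   # the bound is a constant multiple of sigma^2 / n
```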

These approaches apply broadly to parametric, nonparametric, and high-dimensional models, allowing tight minimax lower bounds even under nonstandard loss or structural regimes (Chen et al., 2014, Ramdas et al., 2016).

2. Local, Asymptotic, and Non-asymptotic Minimax Theory

The sharp local asymptotic minimax (LAM) theorem of Hájek and Le Cam provides the exact asymptotic lower bound for smooth (differentiable) functionals under regular experiments:

$$\liminf_{n \rightarrow \infty} \inf_{\widehat\psi_n} \sup_{\|\theta - \theta_0\| \le c n^{-1/2}} n\, \mathbb{E}_{\theta}\left[ (\widehat\psi_n - \psi(\theta))^2 \right] \geq \nabla\psi(\theta_0)^T \mathcal{I}(\theta_0)^{-1} \nabla\psi(\theta_0),$$

where $\mathcal{I}(\theta)$ is the Fisher information (Takatsu et al., 2024).
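The right-hand side is straightforward to evaluate once the Fisher information and the gradient of the functional are available; the sketch below (an illustrative model and functional, $\psi(\mu,\sigma)=\mu/\sigma$ in a Gaussian experiment, not taken from the cited work) computes that LAM constant:

```python
import numpy as np

# Example: X_i ~ N(mu, s^2) with theta = (mu, s); functional psi(theta) = mu / s.
# Per-observation Fisher information for (mu, s) is diag(1/s^2, 2/s^2).
def lam_constant(theta0):
    mu, s = theta0
    fisher = np.diag([1.0 / s**2, 2.0 / s**2])
    grad_psi = np.array([1.0 / s, -mu / s**2])
    # grad_psi^T  I(theta0)^{-1}  grad_psi
    return grad_psi @ np.linalg.solve(fisher, grad_psi)

# LAM lower bound on the rescaled local worst-case MSE for mu/s: 1 + mu^2 / (2 s^2).
print(lam_constant((1.0, 2.0)))   # 1 + 1/(2*4) = 1.125
```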

However, this framework is limited to differentiable functionals and locally regular models. Recent advances have broadened minimax lower bounds to non-differentiable functionals, irregular models, and nonsmooth losses by relying on generalized mixture inequalities built from Hellinger or $\chi^2$ divergences. These bounds eschew explicit differentiability and remain valid, and often sharp, in highly nonregular or boundary-influenced regimes (Takatsu et al., 2024).

Explicitly, the Hellinger-mixture lower bound states that, for any prior $Q$ and any measurable estimator $T$,

$$\inf_T \sup_{\theta \in \Theta} \mathbb{E}_{P_\theta} \|T(X) - \psi(\theta)\|^2 \geq \sup_{h \in \mathbb{R}^d} \left[ \sqrt{A_{Q,\psi}(h)} - \sqrt{B_{Q,\psi}(h)} \right]_+^2,$$

with $A_{Q,\psi}(h)$ and $B_{Q,\psi}(h)$ as prior-difference/denominator and prior-average/numerator terms relative to the Hellinger divergence. This bound fully recovers the sharp asymptotic constants in the regular case, while remaining valid for irregular, nonparametric, or directionally differentiable settings.

This generalized approach recovers the classical LAM bound, van Trees (Bayesian Cramér–Rao), Chapman–Robbins, and Hammersley–Chapman–Robbins bounds as special cases by appropriate choices of prior and divergence (Takatsu et al., 2024).
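To illustrate why such divergence-based two-point bounds survive irregularity, consider the Hammersley–Chapman–Robbins inequality (for unbiased estimators) applied to the non-regular model $\mathrm{Uniform}[0,\theta]$, where the Fisher information does not exist. The sketch below, with assumed helper names and parameter values, numerically optimizes the $\chi^2$-based bound and recovers the $1/n^2$ rate characteristic of this model:

```python
import numpy as np

def hcr_bound_uniform(theta, n):
    """Hammersley-Chapman-Robbins bound for unbiased estimation of theta
    from X_1..X_n ~ Uniform[0, theta].

    Var(T) >= sup_h h^2 / chi^2(P_{theta-h}^n || P_theta^n),
    where chi^2 = (theta / (theta - h))^n - 1 for 0 < h < theta
    (the shift must go downward so that P_{theta-h} << P_theta).
    """
    hs = np.linspace(1e-6, 0.5 * theta, 2000)
    chi2 = (theta / (theta - hs)) ** n - 1
    return np.max(hs**2 / chi2)

theta = 1.0
for n in [10, 100, 1000]:
    # The bound scales like theta^2 / n^2, the correct rate for this irregular model.
    print(n, hcr_bound_uniform(theta, n), theta**2 / n**2)
```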

3. Extensions: Loss Functions, Structural Constraints, Communication, and Privacy

Minimax lower bounds have been sharply extended to general loss functions beyond $L_2$, models under sparsity or low-rank constraints, structured matrix/tensor factorization, and settings with distributed data, privacy, or communication bottlenecks.

Non-quadratic Loss and Functionals: Via Efroimovich's entropy-based inequalities, the van Trees inequality is generalized to $L_q$ losses for general $q \ge 1$ (Chen et al., 2024). Under regularity, for any estimator $\widehat\theta$,

$$\mathbb{E}_\pi \|\widehat\theta - \theta\|_q^q \geq \frac{\sqrt{2 \pi e}}{C_{ME}(q)^q} \Big( |I_X(\theta)|^{1/d} + J(\pi)^{1/d} \Big)^{-q/2},$$

where $J(\pi)$ is the Fisher information of the prior and $C_{ME}(q)$ is the constant arising from the maximum-entropy distribution under the $q$-th moment constraint.
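For orientation, the familiar scalar quadratic-loss special case is the classical van Trees inequality $\mathbb{E}_\pi(\widehat\theta-\theta)^2 \ge 1/(\mathbb{E}_\pi I_X(\theta) + J(\pi))$. The sketch below (a conjugate Gaussian example with assumed parameter values, not from the cited paper) checks that it is attained exactly by the posterior mean:

```python
import numpy as np

# Scalar van Trees check (illustrative): X_1..X_n ~ N(theta, sigma^2), prior theta ~ N(0, tau^2).
n, sigma, tau = 50, 1.0, 2.0

fisher_data = n / sigma**2          # I_X(theta): Fisher information of the whole sample
fisher_prior = 1.0 / tau**2         # J(pi) for a N(0, tau^2) prior
van_trees = 1.0 / (fisher_data + fisher_prior)

# Exact Bayes risk of the posterior mean in this conjugate model (posterior variance):
bayes_risk = 1.0 / (n / sigma**2 + 1.0 / tau**2)

print(van_trees, bayes_risk)        # identical: the bound is sharp here
```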

High-dimensional Constraints: Modern lower bound techniques capture the phase transitions under sparsity ($\ell_0$), low-rank constraints, or Kronecker/tensor structure. For example, the minimax risk for $\ell_0$-sparse linear regression is

$$\Omega\left( \frac{\sigma^2\, k\,\log(d/k)}{n} \right),$$

and for learning a Kronecker-structured dictionary,

$$R^*(N) \gtrsim \min\left\{ p, \frac{r^2}{K}, \frac{1}{NK\,\mathrm{SNR}\, \sum_k m_k p_k} \right\},$$

where the dimension-sum $\sum_k m_k p_k$ replaces the full parameter count $mp$, yielding potentially exponential savings (Shakeri et al., 2016).
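The sparsity rate displayed above is typically derived from a local Fano argument. The sketch below pairs a generic Fano helper with order-of-magnitude stand-ins: the packing size, separation, and KL radius are assumed to be of the right order for $k$-sparse regression, not the exact constructions of the cited papers.

```python
import numpy as np

def fano_lower_bound(sep, log_M, kl_max):
    """Local Fano bound: given M hypotheses whose pairwise squared-loss distance
    is at least `sep` and whose pairwise KL diameter (which upper-bounds the
    mutual information) is at most kl_max, the minimax risk is at least
    (sep / 4) * (1 - (kl_max + log 2) / log M)."""
    return (sep / 4) * max(0.0, 1 - (kl_max + np.log(2)) / log_M)

# Order-of-magnitude instance for k-sparse regression in dimension d with n samples
# (hypothetical numbers; gamma2 is the squared per-coordinate signal size of the packing).
d, k, n, sigma = 10_000, 20, 500, 1.0
log_M = k * np.log(d / k) / 2                # log-packing number of k-sparse supports
gamma2 = sigma**2 * np.log(d / k) / (4 * n)  # chosen so the KL radius stays below log_M / 2
sep = k * gamma2                             # squared-loss separation of the packing
kl_max = n * k * gamma2 / (2 * sigma**2)     # KL radius for a well-conditioned design
print(fano_lower_bound(sep, log_M, kl_max), sigma**2 * k * np.log(d / k) / n)
```

Up to constants, the printed bound matches the $\sigma^2 k \log(d/k)/n$ rate shown above.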

Distributed and Private Estimation: Under locally differentially private (DP) protocols or finite communication, minimax lower bounds can be expressed in terms of constrained Fisher information. For a mean estimation task,

$$R_{L_q}(\Theta) \gtrsim d\, \kappa(q) \max \left\{ \left( \frac{d}{n\, \min\{\varepsilon, \varepsilon^2\}}\right)^{q/2}, \left( \frac{1}{n\, \min\{e^\varepsilon, (e^\varepsilon-1)^2\}} \right)^{q/2} \right\},$$

with matching (up to constants) rates holding for blackboard, sequential, and even non-interactive protocols (Chen et al., 2024).
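Reading the displayed bound as a function of the privacy budget makes the two regimes visible (an $\varepsilon^2$ penalty for high privacy, an $\varepsilon$ penalty for moderate privacy). The snippet below simply evaluates the stated rate with the unspecified constant $\kappa(q)$ suppressed and assumed parameter values:

```python
import numpy as np

def ldp_mean_lower_bound(n, d, eps, q=2):
    """Order of the locally private mean-estimation lower bound displayed above
    (constants, including kappa(q), suppressed)."""
    term1 = (d / (n * min(eps, eps**2))) ** (q / 2)
    term2 = (1 / (n * min(np.exp(eps), (np.exp(eps) - 1) ** 2))) ** (q / 2)
    return d * max(term1, term2)

n, d = 100_000, 50
for eps in [0.1, 1.0, 5.0]:
    # For quadratic loss (q = 2) and small eps the bound scales like d^2 / (n * eps^2).
    print(eps, ldp_mean_lower_bound(n, d, eps))
```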

4. Minimax Lower Bounds in Specific Statistical Problems

Rigorous minimax lower bounds have been developed and established as tight in diverse problem classes:

  • Function Estimation on Graphs: For regression or classification on a graph $G_n$ with Laplacian $L$, if $f$ or $\rho$ is $\beta$-smooth in the Laplacian sense, the minimax rate is (Kirichenko et al., 2017)

$$n^{-2\beta/(2\beta + r)},$$

where $r$ is the “dimension” parameter from spectral geometry.

  • Reinforcement Learning: For episodic tabular MDPs with horizon $H$, $S$ states, and $A$ actions, sample-complexity and regret lower bounds of order

$$\Omega\left( \frac{H^3 S A}{\epsilon^2} \log(1/\delta) \right),\qquad \Omega\left( \sqrt{H^3 S A T} \right)$$

hold for best policy identification and cumulative regret over $T$ steps, respectively.

  • Testing and Independence: For high-dimensional independence testing, any procedure with nontrivial power requires (Ramdas et al., 2016)

$$n \gtrsim \frac{\sqrt{pq}}{\|\Sigma_{XY}\|_F^2},$$

where $\Sigma_{XY}$ is the cross-covariance matrix.

  • Matrix and Tensor Completion: For noisy matrix completion under sparse factor models, the per-element MSE is bounded below by (Sambasivan et al., 2015)

$$R^* \geq C\, \min\left\{ s\,A_{\max}^2,\ \sigma^2 \frac{mr + sn}{N} \right\},$$

where $m, n$ are the matrix dimensions, $r$ the rank, $s$ the sparsity, and $N$ the number of samples.

  • Density Functionals and $L_p$-norms: For estimation of $\|f\|_p$, the rates split according to whether $p$ is an integer (parametric thresholds) or not, with regimes determined by the Nikolskii smoothness; for non-integer $p$, an extra logarithmic penalty appears (Goldenshluger et al., 2020).

5. Minimax Quantiles and High-Probability Lower Bounds

Expectation-based minimax risk fails to capture tail risk or guarantee high-confidence performance. Recent developments introduce minimax quantiles,

$$M(\delta) = \inf_{\widehat\theta} \sup_{\theta \in \Theta} Q_{1-\delta}(\widehat\theta, \theta),$$

where $Q_{1-\delta}$ is the $(1-\delta)$-quantile of the loss (Ma et al., 2024; Bongole et al., 7 Oct 2025). High-probability versions of Le Cam's and Fano's lemmas relate testing complexity to quantile risk, establishing lower bounds for quantiles at all confidence levels. Quantile-to-expectation conversions guarantee

$$R^* \geq \delta\, M(\delta)$$

for any $\delta \in (0,1]$, so quantile lower bounds immediately imply expectation-level ones while refining the understanding of rare-event or worst-case performance.
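The conversion is Markov's inequality applied at the $(1-\delta)$-quantile: $\mathbb{E}[L] \ge Q_{1-\delta}(L)\cdot \mathbb{P}(L \ge Q_{1-\delta}(L)) \ge \delta\, Q_{1-\delta}(L)$. A quick simulation with an arbitrary heavy-tailed loss (purely illustrative) confirms the inequality and shows how much tail information the expectation alone discards:

```python
import numpy as np

rng = np.random.default_rng(0)
loss = rng.pareto(2.5, size=1_000_000)       # stand-in for the loss of some procedure

for delta in [0.5, 0.1, 0.01]:
    q = np.quantile(loss, 1 - delta)         # (1 - delta)-quantile of the loss
    # Markov at level q: E[L] >= q * P(L >= q) >= q * delta.
    print(delta, loss.mean(), delta * q, loss.mean() >= delta * q)
```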

6. Modern Developments: Irregular Models, Nonparametric Functionals, and Tightness

Recent work has unified the Fano, Assouad, Le Cam, and van Trees lower bounds through a mixture-based approach, which, by a careful selection of divergence (Hellinger, $\chi^2$) and perturbation families, yields explicit non-asymptotic or local-asymptotic constants valid for irregular or directionally differentiable functionals (Takatsu et al., 2024; Merhav, 2024). For estimation of the density at a point or non-smooth functionals (e.g., $\psi(\theta)=\max(0,\theta)$), these generalized mixture bounds yield rates and constants that match those achievable by adaptive estimators, even when classical regularity fails.

Notable advantages of modern minimax lower bound techniques include:

  • No requirement for differentiability of the functional or regularity of the underlying statistical experiment.
  • Uniform applicability to vector or scalar parameters, convex/symmetric loss, and arbitrary moment-type losses.
  • Recovery of all classical rate boundaries (parametric, nonparametric, semiparametric), as well as new logarithmic or other extra-factor corrections induced by problem structure.

7. Practical Implications and Outlook

Minimax lower bounds serve as essential benchmarks for algorithmic and statistical optimality in high-dimensional inference, modern nonparametric estimation, distributed and private learning, reinforcement learning, and structured prediction. Their general methodologies and variants—spanning information-theoretic, entropy, and divergence-based formulations—offer unified pathways both to fundamental impossibility results and to sharp guidance for the design of statistically efficient and robust procedures across a spectrum of classical and emerging problems.
