Quantile Least Squares (Q-LS)
- Quantile Least Squares (Q-LS) is a family of estimators that match sample quantiles to model-implied quantiles using least squares, ensuring robust parameter estimation.
- It applies to various models including location-scale, regression, and instrumental variables, achieving high asymptotic efficiency and accurate tail fitting.
- Q-LS methods emphasize computational scalability and robustness with bounded influence functions, making them suitable for handling outliers in large datasets.
Quantile Least Squares (Q-LS) is a broad family of parametric and semiparametric statistical estimators where the fundamental principle is to match (possibly multiple) sample quantiles or quantile relationships by least squares or related norm-minimizing criteria. Methodologies under the Q-LS designation span robust parameter estimation in location-scale families, distributional modeling, instrumental variable analysis, and robust regression (including order-statistic-based Chebyshev fits), all utilizing quantile information as the modeling basis (Adjieteh et al., 2024, Peng et al., 2023, Bertsimas et al., 2013, Puerto et al., 2024, Cherodian et al., 23 Jan 2026).
1. Core Definitions and Mathematical Structure
Let $X_1, \dots, X_n$ be i.i.d. samples or, in the regression case, pairs $(x_i, y_i)$, $i = 1, \dots, n$. The Q-LS approach constructs an estimator by (i) selecting a set of quantile levels $0 < \tau_1 < \cdots < \tau_K < 1$, (ii) computing empirical quantiles $\hat{q}_{\tau_k}$ at each level, and (iii) fitting model-implied quantiles (either parameterized directly or formed as a linear combination of basis quantile functions) by a least-squares or related loss.
Two canonical instances:
- Location-scale family Q-LS estimation: Given a parametric family with quantile function $Q_\theta(\tau) = \mu + \sigma Q_0(\tau)$, $\theta = (\mu, \sigma)$, define the "ordinary" Q-LS estimator (oQLS) as
$$\hat{\theta}_{\mathrm{oQLS}} = \arg\min_{\mu, \sigma} \sum_{k=1}^{K} \left( \hat{q}_{\tau_k} - \mu - \sigma Q_0(\tau_k) \right)^2,$$
and the "generalized" version (gQLS) as the weighted LS fit whose weight matrix is the inverse of the asymptotic covariance matrix of the sample quantiles (Adjieteh et al., 2024).
- Quantile regression / Least Quantile of Squares (LQS): For a fixed order $q \in \{1, \dots, n\}$, the LQS estimator finds
$$\hat{\beta}_{\mathrm{LQS}} = \arg\min_{\beta} \, |r(\beta)|_{(q)},$$
where $r_i(\beta) = y_i - x_i^{\top}\beta$, $i = 1, \dots, n$, and $|r(\beta)|_{(q)}$ is the $q$th smallest absolute residual (Bertsimas et al., 2013, Puerto et al., 2024). This connects to the Chebyshev ($\ell_\infty$) fit on subsets of size $q$.
Other Q-LS formulations model the quantile function as a linear combination $Q(\tau) = \sum_{m=1}^{M} w_m B_m(\tau)$ of basis quantile functions $B_1, \dots, B_M$, estimating the weights by constrained least squares on the fitted quantiles (Peng et al., 2023). For instrumental variables, Q-LS projects onto a dictionary of conditional quantile functions, aggregating them with optimal mean-square weights for IV estimation (Cherodian et al., 23 Jan 2026).
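As a concrete illustration of the three-step construction (quantile grid, empirical quantiles, least-squares fit), the following is a minimal oQLS sketch for the normal location-scale family; the function name, seed, and grid are illustrative choices, not taken from the cited papers.

```python
import numpy as np
from statistics import NormalDist

def oqls_location_scale(x, q0, taus):
    """Ordinary Q-LS (oQLS) for a location-scale family: match empirical
    quantiles q_hat(tau_k) to mu + sigma * Q0(tau_k) by unweighted LS."""
    taus = np.asarray(taus)
    q_hat = np.quantile(x, taus)              # (ii) empirical quantiles
    basis = np.array([q0(t) for t in taus])   # standard quantile function values
    # (iii) linear least squares in (mu, sigma): q_hat ~ mu * 1 + sigma * Q0(tau)
    A = np.column_stack([np.ones_like(basis), basis])
    (mu, sigma), *_ = np.linalg.lstsq(A, q_hat, rcond=None)
    return float(mu), float(sigma)

rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=100_000)
taus = np.linspace(0.05, 0.95, 19)            # (i) interior quantile grid
mu_hat, sigma_hat = oqls_location_scale(x, NormalDist().inv_cdf, taus)
```

Because the fit is linear in $(\mu, \sigma)$, a single `lstsq` call suffices; only the quantile extraction touches the raw data.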
2. Asymptotic Theory and Efficiency
Q-LS estimators rely on the joint asymptotic normality of order statistics and sample quantiles:
$$\sqrt{n}\,\bigl( \hat{q}_{\tau_1} - q_{\tau_1}, \dots, \hat{q}_{\tau_K} - q_{\tau_K} \bigr) \xrightarrow{d} N(0, \Sigma),$$
with $\Sigma_{jk} = \dfrac{\min(\tau_j, \tau_k) - \tau_j \tau_k}{f(q_{\tau_j})\, f(q_{\tau_k})}$ for the univariate i.i.d. case (Adjieteh et al., 2024, Peng et al., 2023).
- oQLS: Standard LS asymptotics apply, but with sub-optimal variance when quantile covariances are ignored.
- gQLS: Achieves semiparametric efficiency by using the inverse covariance structure for optimal weighting; in location-scale families it attains 90–99% asymptotic efficiency relative to MLE for practical grid sizes up to $K = 25$ (Adjieteh et al., 2024).
- Mixture Q-LS: In the large-$K$ limit, the objective minimizes the $2$-Wasserstein distance between the true and fitted quantile functions and, under regularity conditions, exhibits Gaussian limiting behavior for fixed quantile grids (Peng et al., 2023).
Robust LQS regression achieves the maximal breakdown point of roughly $1/2$, e.g., 50% for least median of squares (Bertsimas et al., 2013, Puerto et al., 2024). The influence function of Q-LS estimators is bounded by interior quantile trimming, yielding robustness controlled by the trimming parameters.
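The covariance structure behind these results can be assembled directly from the formula above; the sketch below builds $\Sigma$ on a standard-normal quantile grid (names and grid are illustrative choices).

```python
import numpy as np
from statistics import NormalDist

def quantile_cov(taus, pdf_at_q):
    """Asymptotic covariance (scaled by n) of sample quantiles:
    Sigma[j, k] = (min(tau_j, tau_k) - tau_j * tau_k) / (f(q_j) * f(q_k))."""
    taus = np.asarray(taus)
    num = np.minimum.outer(taus, taus) - np.outer(taus, taus)
    return num / np.outer(pdf_at_q, pdf_at_q)

taus = np.linspace(0.1, 0.9, 9)
q = np.array([NormalDist().inv_cdf(t) for t in taus])   # N(0,1) quantiles
f = np.exp(-q ** 2 / 2) / np.sqrt(2 * np.pi)            # density at those quantiles
Sigma = quantile_cov(taus, f)
```

The inverse of this matrix is exactly the gQLS weight; the numerator is the covariance of a Brownian bridge at levels $(\tau_j, \tau_k)$, so $\Sigma$ is symmetric positive definite.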
3. Computational Algorithms and Scalability
Q-LS estimators are computationally tractable for moderate to very large sample sizes, with algorithmic complexity depending on the specific formulation:
- Location-scale Q-LS: For $K$ quantiles, the total computational cost is $O(n \log n)$ for quantile extraction plus small ($K$-scale) matrix algebra; the method remains feasible for very large $n$ with negligible overhead (Adjieteh et al., 2024).
- Mixture Q-LS: Reduces to a quadratic program with convex or linear constraints, solved efficiently even with up to thousands of basis quantile functions, and is readily regularized (e.g., with $\ell_1$ or $\ell_2$ penalties) (Peng et al., 2023).
- LQS regression: NP-hard in general (Puerto et al., 2024), but recent advances enable provably global solutions for moderate $n$ via modern MILP, bilevel, or sorting-based MIP reformulations, and high-quality solutions with sub-percent optimality gaps via continuous relaxations and warm starts for $n$ into the low thousands (Bertsimas et al., 2013, Puerto et al., 2024). Aggregation heuristics cluster the data for scalable approximate Q-LS in large-$n$ settings.
- Efficient quantile regression: Recent combinatorial and randomized divide-and-conquer techniques achieve algorithms strongly polynomial in $n$ for two dimensions, with extensions to higher dimensions, offering order-of-magnitude speedups over traditional LP or IP solvers for large $n$ (Shetiya et al., 2023).
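A toy instance of the quadratic-programming view: with only two basis quantile functions (normal and logistic) and a simplex constraint on the weights, the mixture Q-LS fit has a one-dimensional closed form. This is a deliberately simplified sketch under our own choices, not the cited papers' implementation.

```python
import numpy as np
from statistics import NormalDist

def logistic_ppf(t):
    """Standard logistic quantile function."""
    return np.log(t / (1 - t))

def mixture_qls_weight(x, taus):
    """Fit Q(tau) ~ w * Q_normal(tau) + (1 - w) * Q_logistic(tau), w in [0, 1],
    by least squares on the quantile grid (a 2-basis mixture Q-LS sketch)."""
    taus = np.asarray(taus)
    q_hat = np.quantile(x, taus)
    q1 = np.array([NormalDist().inv_cdf(t) for t in taus])
    q2 = logistic_ppf(taus)
    d = q1 - q2
    w = np.dot(q_hat - q2, d) / np.dot(d, d)   # closed-form LS solution
    return float(np.clip(w, 0.0, 1.0))          # project onto the simplex

rng = np.random.default_rng(1)
taus = np.linspace(0.05, 0.95, 19)
w_normal = mixture_qls_weight(rng.normal(size=50_000), taus)
w_logistic = mixture_qls_weight(rng.logistic(size=50_000), taus)
```

With many bases the same objective becomes a genuine QP (nonnegativity plus monotonicity constraints), but the structure, i.e. a least-squares loss on a fixed quantile grid, is unchanged.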
4. Robustness and Influence Functions
Q-LS methodologies provide intrinsic robustification due to their quantile-centric nature:
- For location-scale Q-LS, the influence functions of both the location and scale estimates are bounded by construction via interior quantile trimming (excluding extreme quantile levels near 0 and 1), yielding a positive breakdown point (Adjieteh et al., 2024). The resulting influence functions resemble those of classic robust estimators, such as Tukey's biweight for location and analogously bounded scale estimators, without requiring explicit tuning-function specification.
- In regression, LQS estimators (e.g., least median of squares, least trimmed squares) reach the maximal breakdown point for $q \approx n/2$ (the median case), withstanding up to 50% contamination before breakdown (Bertsimas et al., 2013, Puerto et al., 2024). This property is independent of the dimensionality $p$.
- In the mixture-Q-LS context, monotonicity and tail constraints, as well as regularization penalties, can bolster finite-sample robustness (Peng et al., 2023).
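The breakdown behavior can be demonstrated with an exact univariate least-median-of-squares location fit via the shortest-half construction; contamination levels and names below are illustrative.

```python
import numpy as np

def lms_location(x):
    """Least-median-of-squares location estimate: the midpoint of the
    shortest interval containing a half-sample (minimizes the median
    squared residual)."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    h = n // 2 + 1                         # half-sample size
    widths = x[h - 1:] - x[: n - h + 1]    # width of each half-sample window
    i = int(np.argmin(widths))             # shortest half
    return 0.5 * (x[i] + x[i + h - 1])

rng = np.random.default_rng(2)
clean = rng.normal(loc=0.0, scale=1.0, size=600)
outliers = rng.normal(loc=50.0, scale=0.1, size=400)   # 40% contamination
x = np.concatenate([clean, outliers])
est_lms = lms_location(x)   # stays near 0 despite the contamination
est_mean = x.mean()         # dragged far toward the outlier cluster
```

The outlier cluster holds fewer than half the points, so no half-sample window can avoid the clean data; the shortest half therefore sits inside the clean cluster and the estimate stays near the true location.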
5. Model Validation and Applications
Q-LS supports model checking and validation through quantile-based residuals and bespoke goodness-of-fit tests:
- χ²-type in-sample validation (location-scale gQLS): the test statistic is the weighted quadratic form in the quantile residuals,
$$T_n = n \left( \hat{q} - Q_{\hat{\theta}} \right)^{\!\top} \hat{\Sigma}^{-1} \left( \hat{q} - Q_{\hat{\theta}} \right) \xrightarrow{d} \chi^2_{K-2},$$
controlling the nominal level and showing high power against misspecified alternatives (Adjieteh et al., 2024).
- Out-of-sample validation: Resampling on test quantile grids, using a parametric bootstrap to compute $p$-values since the null distribution is not available analytically (Adjieteh et al., 2024).
- Empirical illustration: Google daily stock returns ($n = 986$) strongly prefer the logistic distribution over the normal, Laplace, Gumbel, and Cauchy under gQLS-based tests (Adjieteh et al., 2024).
- Instrumental Variables: Q-LS improves identification in weak-instrument settings by aggregating conditional quantiles for IV estimation, enabling consistent inference even when the conditional mean is flat but distributional relevance persists (Cherodian et al., 23 Jan 2026).
- Distributional modeling: Q-LS mixture frameworks outperform MLE and are markedly superior at tail fitting for financial datasets, e.g., S&P500 drawdowns (Peng et al., 2023).
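The χ²-type statistic can be sketched for a normal null as below. Note the hedge: we substitute a pilot oQLS fit for the paper's gQLS fit, so the limiting distribution is only approximately the stated $\chi^2_{K-2}$; the sketch is meant to show the structure of the quadratic form, and the contrast between a well-specified and a misspecified sample.

```python
import numpy as np
from statistics import NormalDist

def qls_chi2_stat(x, taus):
    """Chi^2-type quantile goodness-of-fit statistic for a normal
    location-scale null: T = n * r' Sigma^{-1} r, where r are quantile
    residuals and Sigma is the asymptotic quantile covariance."""
    n = len(x)
    taus = np.asarray(taus)
    q_hat = np.quantile(x, taus)
    z = np.array([NormalDist().inv_cdf(t) for t in taus])
    A = np.column_stack([np.ones_like(z), z])
    (mu, sigma), *_ = np.linalg.lstsq(A, q_hat, rcond=None)   # pilot oQLS fit
    r = q_hat - (mu + sigma * z)                               # quantile residuals
    f = np.exp(-z ** 2 / 2) / (np.sqrt(2 * np.pi) * sigma)    # fitted density at quantiles
    Sigma = (np.minimum.outer(taus, taus) - np.outer(taus, taus)) / np.outer(f, f)
    return float(n * r @ np.linalg.solve(Sigma, r))

rng = np.random.default_rng(4)
taus = np.linspace(0.05, 0.95, 19)
t_null = qls_chi2_stat(rng.normal(size=20_000), taus)          # well-specified
t_alt = qls_chi2_stat(rng.standard_cauchy(size=20_000), taus)  # heavy-tailed alternative
```

Under gross misspecification (Cauchy data against a normal null) the statistic is orders of magnitude larger than under the null, which is the qualitative power behavior the section describes.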
6. Implementation Guidelines and Parameter Choices
Practical implementation of Q-LS hinges on quantile grid selection, trimming, weighting, and computational solver choice:
- For location-scale Q-LS, a grid of up to $K = 25$ equally spaced interior quantiles (bounded away from 0 and 1) suffices for near-maximal efficiency; keeping the extreme levels trimmed is recommended for moderate robustness (Adjieteh et al., 2024).
- Generalized Q-LS (gQLS) weighting is always preferable due to improved efficiency at negligible extra cost (Adjieteh et al., 2024, Peng et al., 2023).
- In mixture Q-LS, monotonicity (quantile gridwise) and nonnegativity constraints ensure model validity, with regularization aiding parsimony and performance in high-dimensional basis spaces (Peng et al., 2023).
- For regression LQS, the indicator-based bilevel reformulation is the most efficient exact approach for $n$ up to roughly $200$ (Puerto et al., 2024). Aggregation heuristics cluster large datasets down to tractable size, with error bounded by the cluster diameter.
- For instrumental-variables Q-LS, the quantile grid size must satisfy growth conditions relative to the sample size for the theoretical results to hold; ridge or LASSO regularization improves stability when many quantiles are used (Cherodian et al., 23 Jan 2026).
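The grid and weighting recommendations above can be combined into a small gQLS sketch for the normal family, iterating the weight matrix from a pilot oQLS fit; the function name, seed, and two-step iteration are our own choices.

```python
import numpy as np
from statistics import NormalDist

def gqls_location_scale(x, taus):
    """Generalized Q-LS (gQLS) sketch for the normal family: weighted
    least squares on the quantile grid with weight matrix Sigma^{-1},
    iterated from a pilot oQLS fit."""
    taus = np.asarray(taus)
    q_hat = np.quantile(x, taus)
    z = np.array([NormalDist().inv_cdf(t) for t in taus])
    A = np.column_stack([np.ones_like(z), z])
    theta, *_ = np.linalg.lstsq(A, q_hat, rcond=None)   # pilot oQLS fit
    for _ in range(2):                                   # refresh weights twice
        mu, sigma = theta
        f = np.exp(-z ** 2 / 2) / (np.sqrt(2 * np.pi) * sigma)
        Sigma = (np.minimum.outer(taus, taus) - np.outer(taus, taus)) / np.outer(f, f)
        W = np.linalg.inv(Sigma)                         # gQLS weight matrix
        theta = np.linalg.solve(A.T @ W @ A, A.T @ W @ q_hat)
    return float(theta[0]), float(theta[1])

rng = np.random.default_rng(3)
x = rng.normal(loc=-1.0, scale=0.5, size=40_000)
taus = np.linspace(0.05, 0.95, 19)
mu_g, sigma_g = gqls_location_scale(x, taus)
```

The extra cost over oQLS is one $K \times K$ inversion per weight refresh, which is negligible next to the quantile extraction, consistent with the recommendation to prefer gQLS by default.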
7. Scope, Connections, and Extensions
Q-LS unifies a spectrum of robust estimation, regression, and distributional modeling techniques under the principle of quantile-matching via norm minimization:
- It generalizes M-estimation, $L_1$-regression, least-trimmed and least-median squares, and $2$-Wasserstein projection.
- Recent algorithmic advances make robust Q-LS estimators computationally feasible for large-scale and high-dimensional data, with strong theoretical guarantees for small- and large-sample performance.
- The choice of quantile levels, weighting, trimming, and constraint structure allows Q-LS estimators to target robustness, efficiency, parsimony, or tail accuracy as dictated by the application.
Q-LS estimators therefore provide flexible, scalable, and theoretically sound alternatives to classical likelihood-based procedures, especially when distributional robustness, heavy tails, or outlier resistance are essential (Adjieteh et al., 2024, Peng et al., 2023, Bertsimas et al., 2013, Puerto et al., 2024, Cherodian et al., 23 Jan 2026).