Conformal Interquantile Regression (CIR)

Updated 13 January 2026
  • Conformal Interquantile Regression (CIR) is a method that constructs near-minimal, distribution-aware prediction intervals with finite-sample marginal coverage.
  • It leverages interquantile regression and conformal calibration to efficiently adapt to skewed and heteroscedastic conditional distributions.
  • The approach, along with its CIR+ variant, offers strong theoretical guarantees and computational advantages over methods like CQR and CHR.

Conformal Interquantile Regression (CIR) is a method for constructing near-minimal prediction intervals in regression that guarantee finite-sample marginal coverage, achieve approximate conditional coverage under mild assumptions, and adapt to skewed, heteroscedastic conditional distributions. CIR applies conformal calibration to outcome distributions estimated via interquantile regression, yielding efficient, distribution-aware intervals. The framework is realized with fast algorithms and has been extended by the CIR+ variant and connected to other conformal prediction approaches such as Conformalized Quantile Regression (CQR), Conformal Histogram Regression (CHR), and Conformal Thresholded Intervals (CTI) (Guo et al., 6 Jan 2026, Romano et al., 2019, Luo et al., 2024, Sesia et al., 2019, Gupta et al., 2019).

1. Problem Setup and Fundamental Definitions

CIR operates in the standard regression setting: given exchangeable data $\{(x_i,y_i)\}_{i=1}^n$ with $x_i\in\mathbb{R}^d,\ y_i\in\mathbb{R}$, one splits the sample into a training set $\mathcal{D}_{\rm train}$, a calibration set $\mathcal{D}_{\rm cal}$, and a test set $\mathcal{D}_{\rm test}$. The goal is to construct, for each test $x$, a prediction interval $C(x)\subseteq\mathbb{R}$ such that the marginal coverage

$$\mathbb{P}\{Y\in C(X)\} \ge 1-\alpha$$

holds for a user-specified miscoverage $\alpha\in(0,1)$. Ideally, the interval also achieves approximate conditional coverage $\mathbb{P}\{Y\in C(x)\mid X=x\}\ge 1-\alpha$ and is as short as possible.
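The two quantities in this setup, coverage and interval length, can be estimated on a held-out test set. The following is a minimal sketch; the function names `empirical_coverage` and `mean_length` are hypothetical, and intervals are assumed to be closed pairs `(lo, hi)`.

```python
def empirical_coverage(intervals, ys):
    """Fraction of outcomes y that fall inside their interval [lo, hi]."""
    hits = sum(1 for (lo, hi), y in zip(intervals, ys) if lo <= y <= hi)
    return hits / len(ys)


def mean_length(intervals):
    """Average width hi - lo over all intervals."""
    return sum(hi - lo for lo, hi in intervals) / len(intervals)


# Toy check: four test points, three of which are covered.
intervals = [(0.0, 1.0), (0.0, 1.0), (2.0, 3.0), (2.0, 3.0)]
ys = [0.5, 1.5, 2.5, 3.0]
coverage = empirical_coverage(intervals, ys)
avg_len = mean_length(intervals)
```

An empirical coverage close to $1-\alpha$ on test data estimates marginal coverage only; it says nothing about conditional coverage at a particular $x$.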

Interquantile regression fits a black-box quantile regressor to estimate the $t/T$-th conditional quantile $\hat q_t(x)$ for $t=0,1,\dots,T$. The $t$-th interquantile interval is $I_t(x) = (\hat q_{t-1}(x), \hat q_t(x)]$. The classical two-quantile case uses $\hat q_{\alpha/2}(x)$ and $\hat q_{1-\alpha/2}(x)$ to define the interquantile range $\Delta(x) = \hat q_{1-\alpha/2}(x) - \hat q_{\alpha/2}(x)$ (Guo et al., 6 Jan 2026, Gupta et al., 2019).
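As an illustration of the interquantile intervals, the sketch below builds the bins $I_t(x)$ from a list of estimated quantiles at a single $x$ and locates the bin containing a given $y$. The helper names are hypothetical, and the quantile estimates are assumed to be already sorted (no quantile crossing).

```python
from bisect import bisect_left


def interquantile_bins(q):
    """Bins I_t = (q[t-1], q[t]] for t = 1..T, given quantiles q[0..T]."""
    return [(q[t - 1], q[t]) for t in range(1, len(q))]


def bin_index(q, y):
    """Index t with q[t-1] < y <= q[t], or None if y is outside (q[0], q[T]]."""
    t = bisect_left(q, y)  # smallest index t with q[t] >= y
    if t == 0 or t == len(q):
        return None
    return t


# Assumed quantile estimates at one x, for T = 3.
q = [0.0, 1.0, 3.0, 6.0]
bins = interquantile_bins(q)
```

Here `bin_index(q, 1.5)` falls in the second bin `(1.0, 3.0]`, while a value at or below `q[0]` or above `q[T]` is reported as outside the estimated range.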

2. CIR Algorithm: Calibration and Construction

CIR’s validity arises from its calibration strategy via conformity scores. For each calibration pair $(x_i, y_i)$, CIR computes $k_i$, the smallest number of consecutive interquantile bins whose union covers $y_i$:

$$s(x_i, y_i) = k_i = \min\{k: y_i \in C_k(x_i)\},$$

where $C_k(x)$ is the shortest union of $k$ adjacent intervals $I_t(x)$. With $m=|\mathcal{D}_{\rm cal}|$, the rank $r_\alpha = \lceil(1-\alpha)(m+1)\rceil$ selects which ordered calibration score serves as the threshold. The calibrated value $\hat k$ is the $r_\alpha$-th smallest score, and prediction proceeds by outputting $C_{\rm CIR}(x) = (\hat q_{l_{\hat k}}(x), \hat q_{l_{\hat k}+\hat k}(x)]$, where $l_{\hat k}$ is the index of the left endpoint of $C_{\hat k}(x)$ (Guo et al., 6 Jan 2026).
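The calibration and prediction steps can be sketched as follows. This is an unoptimized illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, quantile estimates are assumed sorted, a $y$ outside the estimated quantile range is assigned the score $T+1$, and an unbounded interval is returned when the threshold cannot be met. The search recomputes each shortest union naively, so it does not reproduce the fast algorithm's cost.

```python
import math


def shortest_union(q, k):
    """Left index l minimizing q[l+k] - q[l], so that C_k = (q[l], q[l+k]]."""
    T = len(q) - 1
    return min(range(T - k + 1), key=lambda l: q[l + k] - q[l])


def cir_score(q, y):
    """Smallest k such that y lies in the shortest union of k adjacent bins."""
    T = len(q) - 1
    for k in range(1, T + 1):
        l = shortest_union(q, k)
        if q[l] < y <= q[l + k]:
            return k
    return T + 1  # assumption: y falls outside (q[0], q[T]]


def calibrate(cal_quantiles, cal_ys, alpha):
    """Return k_hat, the r_alpha-th smallest calibration score."""
    scores = sorted(cir_score(q, y) for q, y in zip(cal_quantiles, cal_ys))
    m = len(scores)
    r = math.ceil((1 - alpha) * (m + 1))
    return scores[r - 1] if r <= m else math.inf  # assumption: inf if r > m


def cir_interval(q, k_hat):
    """Prediction interval (q[l], q[l + k_hat]] at a test point."""
    T = len(q) - 1
    if k_hat > T:
        return (-math.inf, math.inf)  # assumption: no finite interval suffices
    l = shortest_union(q, k_hat)
    return (q[l], q[l + k_hat])


# Toy example: the same assumed quantile estimates at every calibration point.
q = [0.0, 1.0, 3.0, 6.0]
k_hat = calibrate([q, q, q], [0.5, 2.0, 5.0], alpha=0.5)
interval = cir_interval(q, k_hat)
```

In practice each calibration point has its own quantile vector produced by the fitted regressor; the toy example reuses one vector only to keep the arithmetic easy to follow.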

The CIR+ enhancement applies a width-based tie-break for samples with identical $k$. The refined fractional score is

$$s^+(x, y) = \min\{k-1 + e_k(x): y \in C_k(x)\},$$

where $e_k(x)$ rescales the length of the $k$-th interval. Calibration now uses the $r_\alpha$-th smallest $s^+$, denoted $\hat s$ and decomposed as $\hat s = \lfloor\hat s\rfloor + \delta$; for a test point $x$, CIR+ outputs the union of either $k^*$ or $k^*+1$ bins based on the scaled interval length, yielding narrower intervals in expectation (Guo et al., 6 Jan 2026).
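The test-time choice between two bin counts can be illustrated with a small sketch. It rests on assumptions not stated in the text: that $e_k(x)$ takes values in $[0,1]$, and that $k^* = \lfloor\hat s\rfloor$. Under those assumptions, the largest $k$ satisfying $k-1+e_k(x)\le\hat s$ is $k^*+1$ when $e_{k^*+1}(x)\le\delta$ and $k^*$ otherwise. The function name and the argument `e_next` (standing for $e_{k^*+1}(x)$) are hypothetical.

```python
import math


def cir_plus_num_bins(s_hat, e_next):
    """Number of bins kept at a test point, given the calibrated score s_hat.

    Assumes e_k(x) lies in [0, 1] and k* = floor(s_hat); e_next is the
    rescaled length e_{k*+1}(x) at the test point.
    """
    k_star = math.floor(s_hat)
    delta = s_hat - k_star
    return k_star + 1 if e_next <= delta else k_star


# With s_hat = 2.5, a test point whose next union is relatively short
# (e_next = 0.5) keeps 3 bins; a relatively long one (e_next = 0.75) keeps 2.
k_short = cir_plus_num_bins(2.5, 0.5)
k_long = cir_plus_num_bins(2.5, 0.75)
```

This makes the mechanism behind the narrower expected width concrete: the extra bin is added only at test points where it is cheap in rescaled length.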

3. Theoretical Guarantees

CIR guarantees finite-sample marginal coverage under the sole assumption of sample exchangeability:

$$\mathbb{P}\{Y\in C_{\rm CIR}(X)\} \ge 1-\alpha.$$

This is achieved by thresholding at the $r_\alpha$-th smallest conformity score: under exchangeability, the rank of the test score among the calibration scores is uniform, which yields the desired coverage level (Guo et al., 6 Jan 2026, Romano et al., 2019, Gupta et al., 2019, Sesia et al., 2019).
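The rank argument can be written out in a few lines. The notation below is introduced for illustration: $s_1,\dots,s_m$ are the calibration scores, $s_{m+1}=s(X,Y)$ is the test score, and $s_{(r)}$ is the $r$-th smallest calibration score, with $r_\alpha\le m$ assumed. The last step additionally assumes that the event $\{s(X,Y)\le\hat k\}$ implies $Y\in C_{\rm CIR}(X)$, which holds when the sets $C_k(x)$ are nested in $k$.

```latex
\begin{align*}
r_\alpha &= \lceil (1-\alpha)(m+1) \rceil, \qquad \hat k = s_{(r_\alpha)}, \\
\mathbb{P}\{ s_{m+1} \le s_{(r_\alpha)} \} &\ge \frac{r_\alpha}{m+1} \ge 1-\alpha, \\
\mathbb{P}\{ Y \in C_{\rm CIR}(X) \} &\ge \mathbb{P}\{ s(X,Y) \le \hat k \} \ge 1-\alpha .
\end{align*}
```

As a numerical check, $m=999$ and $\alpha=0.1$ give $r_\alpha=\lceil 0.9\cdot 1000\rceil=900$, so the bound is $900/1000=0.9$.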

Asymptotic conditional coverage is attained when the data are i.i.d., quantile estimation is consistent, and $P_{Y|X=x}$ is unimodal. As $m,T\to\infty$,

$$\mathbb{P}\{Y\in C_{\rm CIR}(x)\mid X=x\}\to 1-\alpha$$

and the length of $C_{\rm CIR}(x)$ approaches that of the oracle shortest interval. Unimodality ensures that the smallest unions are nested, supporting optimality in interval construction (Guo et al., 6 Jan 2026, Gupta et al., 2019).

4. Computational Efficiency and Comparisons

CIR’s efficiency arises from bypassing histogram construction (as in CHR) and limiting calibration to $O(mT)$ operations (checking at most $T$ bins per calibration sample), with $O(T)$ test-time complexity per point. For comparison:

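| Method | Calibration Complexity | Test-Time Complexity | Adaptivity to Skewness |
|--------|------------------------|----------------------|------------------------|
| CIR    | $O(mT)$                | $O(T)$               | Strong                 |
| CQR    | $O(m)$                 | $O(1)$               | Weak                   |
| CHR    | $O(mT + T^2)$          | $O(T^2)$             | Strong                 |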

CIR’s algorithmic cost is comparable to that of CQR when CQR is paired with a multi-quantile regressor, and CIR is much faster than CHR, especially for large $T$ (Guo et al., 6 Jan 2026, Luo et al., 2024).

5. Empirical Evaluation

Extensive experiments on synthetic and real-world datasets benchmark CIR and CIR+ against CQR, CHR, DistSplit, DCP, and DCP-CQR. On synthetic data with heteroscedastic, asymmetric, and jump noise models, CIR and CIR+ replicate CHR’s minimal intervals and coverage while using only 1–5% of CHR’s total computation time. CQR yields noticeably wider intervals under skewed distributions.

Real dataset evaluations span seven UCI/MEPS regression tasks. CIR+ typically yields the shortest or near-shortest intervals for both neural-net and random-forest quantile regressors, maintaining marginal coverage near 90% and providing strong conditional coverage. Calibration and prediction are up to 100× faster than histogram-based CHR, and CIR+ intervals are narrowest in most splits (Guo et al., 6 Jan 2026, Luo et al., 2024, Sesia et al., 2019).

6. Connections to Other Methods

CIR generalizes and extends prior conformal regression approaches. Conformalized Quantile Regression (CQR) (Romano et al., 2019, Sesia et al., 2019) is recovered in the two-quantile case, constructing intervals $[\hat q_{\alpha/2}(x)-\widehat Q_{1-\alpha},\ \hat q_{1-\alpha/2}(x)+\widehat Q_{1-\alpha}]$ with $\widehat Q_{1-\alpha}$ the empirical conformal quantile of the nonconformity scores. CQR is theoretically valid but adapts poorly to skewed distributions.
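For comparison, a minimal sketch of the CQR construction is given below. It uses the standard CQR nonconformity score $\max\{\hat q_{\alpha/2}(x)-y,\ y-\hat q_{1-\alpha/2}(x)\}$ and takes $\widehat Q_{1-\alpha}$ to be the $\lceil(1-\alpha)(m+1)\rceil$-th smallest calibration score. The function and argument names are hypothetical, and returning infinity when the calibration set is too small is an assumption of this sketch.

```python
import math


def cqr_calibrate(cal_lo, cal_hi, cal_ys, alpha):
    """Conformal quantile of the scores max(lo - y, y - hi)."""
    scores = sorted(
        max(lo - y, y - hi) for lo, hi, y in zip(cal_lo, cal_hi, cal_ys)
    )
    m = len(scores)
    r = math.ceil((1 - alpha) * (m + 1))
    return scores[r - 1] if r <= m else math.inf  # assumption: inf if r > m


def cqr_interval(lo, hi, q_hat):
    """Widen (or shrink, if q_hat < 0) both ends by the same amount."""
    return (lo - q_hat, hi + q_hat)


# Toy calibration set with lower/upper quantile estimates 0 and 1 everywhere.
q_hat = cqr_calibrate([0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [0.5, 1.5, -1.0], 0.5)
interval = cqr_interval(0.0, 1.0, q_hat)
```

Because the same correction is applied to both endpoints, the interval cannot shift toward the heavier tail, which is the source of the weak adaptivity to skewness noted above.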

Conformal Histogram Regression (CHR) achieves strong distribution adaptivity via histogram binning but at the cost of $O(T^2)$ computational complexity (Luo et al., 2024). CIR is “CHR without histogram”: it finds minimal unions of quantile intervals for coverage, directly leveraging multi-quantile regression without explicit density estimation.

Conformal Thresholded Intervals (CTI) calibrate by thresholding interquantile interval lengths; the threshold is set to the $(1-\alpha)$-quantile of interval-length scores, approximating Neyman–Pearson optimality (Luo et al., 2024). CIR’s fractionally-scored enhancement in CIR+ is similar in spirit.

Nested conformal frameworks and their cross-conformal, OOB, and jackknife+ extensions further unify these approaches, subsuming all such nonconformity scores under the calibration of rank-based level sets for prediction (Gupta et al., 2019).

7. Limitations, Extensions, and Best-Use Scenarios

CIR requires high-quality multi-quantile regressors; poor quantile estimation or limited data degrade interval quality and conditional validity. CIR’s asymptotic optimality for conditional coverage relies on the unimodality of $P_{Y|X=x}$, though marginal coverage is achieved regardless.

CIR and especially CIR+ are suited for large-scale regression tasks with skewness and heteroscedasticity, settings that demand per-point interval adaptivity without the prohibitive cost of histogram-based methods. Fast recalibration under concept shift is naturally supported, especially with nonparametric quantile regression backends.

Extensions such as the CTI approach and ensemble methods (e.g., QOOB) continue to generalize CIR by deploying aggregation, cross-conformalization, and uniform marginal coverage under exchangeability (Gupta et al., 2019, Luo et al., 2024).

