
Conformal Test Martingales (CTMs)

Updated 10 January 2026
  • CTMs are methods within the conformal prediction framework that transform conformity scores into p-values and martingale processes to evaluate exchangeability in data streams.
  • They leverage the uniform distribution of p-values under exchangeability and employ Ville’s inequality to assess significant deviations from expected behavior.
  • CTMs may fail to detect A-cryptic change-points when distributional shifts leave the conformity score law invariant, highlighting inherent limitations in their detection power.

Conformal Test Martingales (CTMs) are a foundational methodology within the conformal prediction framework for sequential testing of the exchangeability assumption in data streams. They operationalize exchangeability validation by converting sequences of conformity scores (derived from domain-specific or oracle-based measures) into p-values, and subsequently into stochastic processes (martingales) that serve as detectors of departures from the uniformity implied by exchangeability. Recent research has clarified both the strengths and intrinsic limitations of CTMs, uncovering classes of distributional shift that are fully cryptic to this methodology even for oracle measures, and thus establishing a precise boundary on their statistical power (Szabadváry, 3 Jan 2026).

1. Formal Definition and Theoretical Basis

A CTM is defined relative to a sequence of data points $Z_1, Z_2, \dots$ and a conformity (or nonconformity) measure $A$ selected by the user. The measure $A$ maps each new point and the historical batch to a real-valued score, typically representing the “typicality” or density (oracle case) of the data conditional on the sequence so far. Each data point $Z_t$ is mapped by $A$ to a “smoothed” conformal p-value $p_t \in (0, 1)$; under exchangeability of $Z_1, Z_2, \dots$, the resulting sequence $p_1, p_2, \dots$ is IID and uniformly distributed on $(0,1)$.
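As a minimal sketch of the score-to-p-value step, the following assumes the conformity scores have already been computed by a fixed scoring function (larger meaning more typical), so the dependence of $A$ on the historical batch is not modeled here. The function name `smoothed_conformal_pvalues` and the Gaussian toy scores are illustrative assumptions, not taken from the source.

```python
import numpy as np

def smoothed_conformal_pvalues(scores, rng):
    """Online smoothed p-values from conformity scores (larger = more typical).

    p_t = (#{i <= t : a_i < a_t} + theta_t * #{i <= t : a_i = a_t}) / t,
    with theta_t drawn independently from Uniform(0, 1) to break ties.
    """
    scores = np.asarray(scores, dtype=float)
    pvals = np.empty(len(scores))
    for t in range(len(scores)):
        seen = scores[: t + 1]                 # scores observed up to and including t
        less = np.sum(seen < scores[t])        # strictly less typical points
        ties = np.sum(seen == scores[t])       # always >= 1 (the point itself)
        pvals[t] = (less + rng.uniform() * ties) / (t + 1)
    return pvals

rng = np.random.default_rng(0)
scores = rng.normal(size=500)                  # toy exchangeable scores (assumption)
p = smoothed_conformal_pvalues(scores, rng)
print("mean p-value:", p.mean())
```

A low conformity score relative to the points seen so far yields a small p-value, which is what a betting function later exploits.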

A conformal test martingale $M_t$ is any nonnegative martingale adapted to the filtration generated by the p-value sequence, initialized at $M_0 = 1$. A standard construction takes the product form

$$M_t = \prod_{i=1}^t S(p_i), \qquad \mathbb{E}[M_{t+1} \mid p_1, \dots, p_t] = M_t$$

for some score function $S(\cdot)$, typically chosen to upweight small (unlikely) p-values. The martingale property requires $S \ge 0$ and $\int_0^1 S(p)\,dp = 1$, so that $\mathbb{E}[S(p_{t+1})] = 1$ when $p_{t+1}$ is uniform. Key properties include:

  • Under exchangeability, $M_t$ remains a true martingale and does not systematically grow.
  • By Ville’s inequality, $\mathbb{P}[\sup_{t\ge 1} M_t \ge C] \le 1/C$ under exchangeability. Thus, an observed $M_t$ greatly exceeding $1$ is strong evidence against exchangeability (Szabadváry, 3 Jan 2026).
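The product construction and the Ville threshold can be sketched as follows. This example assumes the power betting function $S(p) = \varepsilon p^{\varepsilon - 1}$, which integrates to 1 on $(0,1)$; the source does not say which score function it uses, so this choice, the value `eps=0.5`, the threshold `C = 100`, and the Beta-distributed "skewed" p-values are all illustrative assumptions.

```python
import numpy as np

def log_power_martingale(pvals, eps=0.5):
    """Log-path of M_t = prod_i eps * p_i**(eps - 1), computed in log space."""
    pvals = np.asarray(pvals, dtype=float)
    log_bets = np.log(eps) + (eps - 1.0) * np.log(pvals)
    return np.cumsum(log_bets)

rng = np.random.default_rng(1)
uniform_p = rng.uniform(size=2000)          # p-values as under exchangeability
skewed_p = rng.beta(0.5, 1.0, size=2000)    # hypothetical p-values piled up near 0

C = 100.0                                   # Ville: P[sup M_t >= C] <= 1/C under the null
log_m_null = log_power_martingale(uniform_p)
log_m_alt = log_power_martingale(skewed_p)

alt_alarm = bool(np.max(log_m_alt) >= np.log(C))
print("final log M (uniform p-values):", log_m_null[-1])
print("final log M (skewed p-values): ", log_m_alt[-1])
print("alarm on skewed p-values:", alt_alarm)
```

Working in log space avoids overflow when the martingale grows quickly. With a fixed `eps`, this particular martingale tends to drift downward on uniform p-values rather than hover at 1, which is compatible with the martingale property since only its expectation is preserved.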

2. Exchangeability, Uniformity, and the Limitations of the Converse

Exchangeability of the data sequence is not equivalent to the uniformity of conformal p-values, but rather strictly stronger. The classical conformal validity theorem asserts that exchangeability implies IID uniformity of the p-values:

$$(Z_1, Z_2, \dots) \text{ exchangeable} \implies p_1, p_2, \dots \sim \text{IID } \text{Uniform}(0, 1)$$

However, the converse fails: one can construct sequences where exchangeability is violated, yet the resulting sequence of p-values, and hence the CTM process, continues to behave as if no change occurred. This phenomenon is termed conformal blindness or $A$-crypticity, highlighting that CTMs fundamentally test only for changes that affect the p-value distribution (Szabadváry, 3 Jan 2026).

3. $A$-Cryptic Change-Points and Their Explicit Construction

A central advance is the formalization of $A$-cryptic change-points. For two distributions $(Q_0, Q_1)$ and a given conformity measure $A$, the pair is $A$-cryptic if, under a change from $Q_0$ to $Q_1$, the conformity score distribution (and hence the p-value law) is invariant:

$$A(X, Y) \text{ under } Q_0 \;\stackrel{d}{=}\; A(X, Y) \text{ under } Q_1$$

This condition is sufficient to guarantee that all conformal p-values, and thus any CTM built on $A$, are blind to the change. The phenomenon can be realized concretely even for oracle conformity measures: e.g., in the bivariate Gaussian setup with $Q_0 = \mathcal{N}(\mu_0, \Sigma)$ and $Q_1 = \mathcal{N}(\mu_1, \Sigma)$, if the shift $\mu_1 - \mu_0$ lies precisely along the direction with slope $\rho \frac{\sigma_Y}{\sigma_X}$ determined by the covariance matrix, the conditional score $A = f_{Y|X}$ is identically distributed before and after the change. As a result, even a drastic change in the distribution of $(X, Y)$ may be entirely undetectable by any CTM based on $A = f_{Y|X}$ (Szabadváry, 3 Jan 2026).
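The Gaussian case can be checked with a short worked derivation. It is a sketch under the assumption that $A$ is the conditional density $f_{Y|X}$ of the pre-change distribution $Q_0$, with $\Sigma$ having variances $\sigma_X^2, \sigma_Y^2$ and correlation $\rho$; the symbols $\beta$, $\tau^2$, $R$, and $\Delta$ are introduced here for illustration and do not appear in the source.

$$
\begin{aligned}
\beta &= \rho\,\frac{\sigma_Y}{\sigma_X}, \qquad \tau^2 = \sigma_Y^2(1-\rho^2), \qquad R = (Y - \mu_{0,Y}) - \beta\,(X - \mu_{0,X}),\\
A(X, Y) &= f_{Y|X}(Y \mid X) = \frac{1}{\sqrt{2\pi\tau^2}}\exp\!\left(-\frac{R^2}{2\tau^2}\right),\\
\text{under } Q_0:&\quad R \sim \mathcal{N}(0,\ \tau^2),\\
\text{under } Q_1:&\quad R \sim \mathcal{N}(\Delta_Y - \beta\,\Delta_X,\ \tau^2), \qquad \Delta = \mu_1 - \mu_0 .
\end{aligned}
$$

The variance of $R$ is unchanged because $\Sigma$ is shared: $\sigma_Y^2 + \beta^2\sigma_X^2 - 2\beta\rho\,\sigma_X\sigma_Y = \sigma_Y^2(1-\rho^2)$. Hence, whenever $\Delta_Y = \beta\,\Delta_X$, the residual $R$, and therefore the score $A(X, Y)$, has the same law under $Q_0$ and $Q_1$, regardless of how large $\Delta$ is.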

4. Simulation Studies Demonstrating Conformal Blindness

Empirical studies corroborate the existence and impact of $A$-cryptic change-points. The canonical setup entails pre-change samples $(X,Y)$ drawn IID from $Q_0=\mathcal{N}((0,0), \Sigma)$, then a post-change phase from $Q_1 = \mathcal{N}(\mu_1, \Sigma)$, with two regimes:

  • Non-cryptic shift ($\mu_1$ off the cryptic line): the CTM grows rapidly, the p-value histogram clearly departs from uniformity, and conformal prediction intervals widen dramatically.
  • Cryptic shift ($\mu_1=(20,10)$ on the cryptic line): the p-value histogram remains uniform, the martingale remains near 1, and conformal intervals stay optimally tight, despite full statistical distinction between the underlying distributions (Szabadváry, 3 Jan 2026).

This demonstrates that even unbounded deviations in data distribution can be rendered invisible to a CTM if they leave the conformity-score law invariant relative to the chosen $A$.
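The two regimes can be reproduced in a small simulation. This is a sketch rather than the cited study's setup: the source does not state $\Sigma$, so unit variances with $\rho = 0.5$ are assumed (which places $\mu_1 = (20, 10)$ on the cryptic line), the off-line shift $\mu_1 = (20, 0)$ is hypothetical, and the power betting function with `eps=0.5` is an assumed choice of martingale.

```python
import numpy as np

RHO = 0.5  # assumed correlation; with unit variances the cryptic slope equals RHO
COV = np.array([[1.0, RHO], [RHO, 1.0]])

def conformity(xy):
    """Monotone transform of the Q0 conditional density f_{Y|X}.

    Conformal p-values depend only on score ranks, so the negative squared
    residual yields the same p-values as the density itself.
    """
    resid = xy[:, 1] - RHO * xy[:, 0]
    return -(resid ** 2)

def smoothed_pvalues(scores, rng):
    pvals = np.empty(len(scores))
    for t in range(len(scores)):
        seen = scores[: t + 1]
        less = np.sum(seen < scores[t])
        ties = np.sum(seen == scores[t])
        pvals[t] = (less + rng.uniform() * ties) / (t + 1)
    return pvals

def log_power_martingale(pvals, eps=0.5):
    return np.cumsum(np.log(eps) + (eps - 1.0) * np.log(pvals))

def simulate(mu1, seed=0, n_pre=500, n_post=500):
    rng = np.random.default_rng(seed)
    pre = rng.multivariate_normal([0.0, 0.0], COV, size=n_pre)   # Q0 phase
    post = rng.multivariate_normal(mu1, COV, size=n_post)        # Q1 phase
    scores = conformity(np.vstack([pre, post]))
    return log_power_martingale(smoothed_pvalues(scores, rng))

log_m_cryptic = simulate(mu1=[20.0, 10.0])  # shift along the cryptic line
log_m_visible = simulate(mu1=[20.0, 0.0])   # hypothetical shift off the line
print("final log M, cryptic shift:", log_m_cryptic[-1])
print("final log M, visible shift:", log_m_visible[-1])
```

Under these assumptions the off-line shift drives the log-martingale far above any practical threshold, while the cryptic shift leaves it behaving as it would with no change at all. With a fixed `eps` that null behavior is a downward drift rather than a path hovering near 1.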

5. Implications for Conformal Testing and Methodological Remediation

The $A$-crypticity phenomenon imposes a strict limitation: CTMs can, by construction, only detect departures from exchangeability that disturb the law of conformity scores as encoded by $A$. Consequently, the coverage guarantees of conformal prediction under exchangeability and the detectability of genuine distributional shifts are coupled. Further, the practical power of CTMs is governed as much by the choice of conformity measure as by the underlying testing protocol.

Plausible implications: Using multiple or ensemble conformity measures may increase robustness to cryptic shifts, as at least one measure might be sensitive to the change. Parallel CTMs or copula-based joint tests on p-value vectors have been suggested to broaden detection (Szabadváry, 3 Jan 2026). Another open problem is the systematic characterization of all possible $A$-cryptic pairs for a given $A$, and investigation of whether adaptive (online, transductive) calibration offers additional protection against conformal blindness.
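One simple way to run parallel CTMs is sketched below. It is not the method proposed in the cited work: it combines $K$ martingales with a plain union bound, raising an alarm if any of them reaches $K/\alpha$, which by Ville's inequality keeps the false-alarm probability at or below $\alpha$ under exchangeability. The second conformity measure (the marginal density of $X$), the covariance with $\rho = 0.5$, and `alpha = 0.01` are illustrative assumptions.

```python
import numpy as np

RHO = 0.5  # assumed correlation, unit variances
COV = np.array([[1.0, RHO], [RHO, 1.0]])

def smoothed_pvalues(scores, rng):
    pvals = np.empty(len(scores))
    for t in range(len(scores)):
        seen = scores[: t + 1]
        less = np.sum(seen < scores[t])
        ties = np.sum(seen == scores[t])
        pvals[t] = (less + rng.uniform() * ties) / (t + 1)
    return pvals

def max_log_power_martingale(pvals, eps=0.5):
    """Running maximum of log M_t for the power betting function."""
    return float(np.max(np.cumsum(np.log(eps) + (eps - 1.0) * np.log(pvals))))

rng = np.random.default_rng(0)
pre = rng.multivariate_normal([0.0, 0.0], COV, size=500)
post = rng.multivariate_normal([20.0, 10.0], COV, size=500)  # cryptic for f_{Y|X}
xy = np.vstack([pre, post])

# Rank-equivalent versions of two Q0 densities (larger = more typical).
measures = {
    "conditional f(y|x)": -((xy[:, 1] - RHO * xy[:, 0]) ** 2),
    "marginal f(x)": -(xy[:, 0] ** 2),  # hypothetical second measure
}

alpha = 0.01
log_threshold = np.log(len(measures) / alpha)  # union bound over K = 2 CTMs
alarms = {
    name: bool(max_log_power_martingale(smoothed_pvalues(s, rng)) >= log_threshold)
    for name, s in measures.items()
}
any_alarm = any(alarms.values())
print(alarms, "-> alarm:", any_alarm)
```

In this toy setting the shift that is cryptic for the conditional score moves the mean of $X$ from 0 to 20, so the CTM based on the marginal measure reacts. The price is a higher per-martingale threshold, which is the kind of efficiency versus detection trade-off the text raises.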

6. Comparative Summary of Properties and Applications

| Aspect | Under Exchangeability | Under $A$-Cryptic Shift ($Q_0 \rightarrow Q_1$) |
|---|---|---|
| Conformal p-values | IID Uniform(0,1) | Remain IID Uniform(0,1) |
| CTM $M_t$ | Non-explosive, $M_t$ stable | Remains stable or drifts downward |
| Conformal intervals | Efficient (small width) | Remain efficient |
| Type of detectable change | Any affecting the $A$-score law | None; shift leaves the $A$-score law invariant |

Applications of CTMs include online change-point detection, sequential monitoring of model validity, and real-time exchangeability assessment. Their efficacy, however, is strictly constrained by the alignment of the conformity measure with the types of distributional shifts that might occur (Szabadváry, 3 Jan 2026).

7. Open Problems and Directions for Further Research

  • Systematic classification of $A$-cryptic change-points for arbitrary conformity measures and data models.
  • Development of ensemble, dynamically adaptive, or joint testing frameworks that maintain coverage guarantees while mitigating conformal blindness.
  • Theoretical investigation into the limits of detectability for structured adaptive calibrators (transductive variants).
  • Quantification of practical trade-offs in prediction-set efficiency versus detection power for different conformity measures.

This body of research establishes both the sharp validity and the precise limitations of CTMs, positioning conformal blindness as a fundamental phenomenon intrinsic to the mechanism of conformal inference (Szabadváry, 3 Jan 2026).

