Generalized tension metrics for multiple cosmological datasets

Published 5 Dec 2025 in astro-ph.CO, astro-ph.IM, hep-ex, hep-ph, and physics.data-an | (2512.06086v1)

Abstract: We introduce a novel estimator to quantify statistical tensions among multiple cosmological datasets simultaneously. This estimator generalizes the Difference-in-Means statistic, $Q_{\rm DM}$, to the multi-dataset regime. Our framework enables the detection of dominant tension directions in the shared parameter space. It further provides a geometric interpretation of the tension for the two- and three-dataset cases in two dimensions. According to this approach, the previously reported increase in tension between DESI and Planck from $1.9\sigma$ (DR1) to $2.3\sigma$ (DR2) is reinterpreted as a more modest shift from $1.18\sigma_{\rm eff}$ (DR1) to $1.45\sigma_{\rm eff}$ (DR2). These new tools may also prove valuable across research fields where dataset discrepancies arise.

Summary

  • The paper introduces a novel global tension metric that generalizes traditional methods for comparing multiple high-dimensional cosmological datasets.
  • It employs tension vectors and a symmetric dispersion tensor to quantify dataset anisotropies and define effective significance measures.
  • The proposed estimator reveals that conventional one-dimensional approaches often overstate tensions, providing a scalable tool for future cosmological analyses.

Generalized Tension Metrics for Multiple Cosmological Datasets

Introduction and Motivation

Persistent discrepancies among cosmological datasets, such as the CMB-inferred and local measurements of the Hubble constant, call for rigorous, multi-dimensional statistical tools to assess the mutual consistency of parameter inferences. Traditional estimators based on one-dimensional marginalized posteriors (e.g., the rule-of-thumb $N_\sigma$ distance between means) are limited in their ability to capture high-dimensional tensions and can significantly misrepresent the true level of disagreement, particularly as the number and quality of datasets increase. Existing global tension metrics, including the Difference-in-Means statistic $Q_{\rm DM}$, do not adequately handle the simultaneous analysis of more than two datasets. The work presented in "Generalized tension metrics for multiple cosmological datasets" (2512.06086) introduces an estimator for quantifying statistical tension among multiple, potentially highly correlated, high-dimensional posterior distributions, generalizing $Q_{\rm DM}$ and providing a geometric interpretation of multi-dataset tension.

Methodological Framework

Tension Vectors and Dispersion Tensor

Let $N$ datasets provide posterior distributions in a shared $D$-dimensional parameter space. For every pair $(i, j)$, define the tension vector

$$\vec{r}_k = \frac{\vec{\bar{\theta}}_i - \vec{\bar{\theta}}_j}{\sqrt{\hat{C}_i + \hat{C}_j}}$$

where $\vec{\bar{\theta}}_i$ and $\hat{C}_i$ are the mean and covariance of the $i$th dataset’s posterior, and $k$ indexes the pairs. The set $\{\vec{r}_k\}$ spans a parameter-difference space, and their dispersion encodes the mutual inconsistencies between datasets.

The proposed global tension estimator is

$$\mathcal{Q} = \frac{1}{N_p} \sum_k |\vec{r}_k|^2$$

with $N_p = N(N-1)$ tension vectors, and the symmetric dispersion tensor

$$\mathcal{C}_{ab} = \frac{1}{N_p} \sum_k r^a_k r^b_k$$

quantifies the geometric distribution of tensions in parameter space, with eigenvalues $\{\lambda_\alpha\}$ and corresponding eigendirections.

Figure 1: Posterior distributions for three synthetic datasets (upper panels), and the associated tension vectors in parameter-difference space (lower panels), with eigenvalues of $\mathcal{C}$ reflecting the strength and geometry of the multi-dataset tension.
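
The definitions above can be sketched numerically for Gaussian posteriors. This is a minimal illustration, not the authors' code: the function names `inv_sqrt_spd` and `tension_summary` are hypothetical, and dividing by $\sqrt{\hat{C}_i + \hat{C}_j}$ is read here as multiplying by the symmetric inverse matrix square root, which is an assumption. Ordered pairs are used so that the count matches $N_p = N(N-1)$; since $\vec{r}_{ij} = -\vec{r}_{ji}$, averaging over unordered pairs would give the same $\mathcal{Q}$ and $\mathcal{C}$.

```python
import itertools
import numpy as np

def inv_sqrt_spd(mat):
    """Symmetric inverse square root of a symmetric positive-definite matrix."""
    evals, evecs = np.linalg.eigh(mat)
    return evecs @ np.diag(evals ** -0.5) @ evecs.T

def tension_summary(means, covs):
    """Return (tension vectors, Q, dispersion tensor) for N Gaussian posteriors."""
    n = len(means)
    vectors = []
    # Ordered pairs (i, j), i != j, giving N_p = N(N-1) vectors.
    for i, j in itertools.permutations(range(n), 2):
        whitener = inv_sqrt_spd(covs[i] + covs[j])  # assumed meaning of 1/sqrt(C_i + C_j)
        vectors.append(whitener @ (means[i] - means[j]))
    r = np.array(vectors)                    # shape (N_p, D)
    q = np.mean(np.sum(r ** 2, axis=1))      # Q = (1/N_p) sum_k |r_k|^2
    dispersion = r.T @ r / len(r)            # C_ab = (1/N_p) sum_k r_k^a r_k^b
    return r, q, dispersion

# Toy example: three 2D posteriors with unit covariance (made-up numbers).
means = [np.array([0.0, 0.0]), np.array([2.0, 0.0]), np.array([1.0, 1.5])]
covs = [np.eye(2), np.eye(2), np.eye(2)]
r, q, disp = tension_summary(means, covs)
print(q, np.trace(disp))
```

By construction the trace of the dispersion tensor equals $\mathcal{Q}$, so the eigenvalues $\lambda_\alpha$ split the total tension among orthogonal directions.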

Hypothesis Testing and Null Model

Under the null hypothesis $H_0$ (“datasets are statistically consistent”), the tension vectors should be centered at zero in the appropriately whitened parameter space. Analytically, if all covariances are identical, $\mathcal{Q}|_{H_0} \sim \Gamma(D,1)$; for more general cases, the null distribution is constructed via the joint distribution of the tension vectors, taking into account their correlations.
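
A Monte Carlo sketch of the null distribution can make this concrete. It assumes independent datasets whose posterior means scatter around a common truth with identical covariance, and it restricts the check to $N=3$, where the sample mean and variance of $\mathcal{Q}$ both come out close to $D$, as a $\Gamma(D,1)$ variable would give. The helper names (`sample_null_q`, `pte_from_samples`, `pte_to_nsigma`) are hypothetical, and the conversion of a probability-to-exceed (PTE) into $N_\sigma$ uses the common two-sided Gaussian convention, which is an assumption here.

```python
from statistics import NormalDist
import numpy as np

def sample_null_q(n_datasets, cov, n_draws, rng):
    """Draw Q under H0: all means scatter around one truth with covariance `cov`."""
    dim = cov.shape[0]
    means = rng.multivariate_normal(np.zeros(dim), cov, size=(n_draws, n_datasets))
    # Whitening by (C + C)^(-1/2), valid because all covariances are identical.
    evals, evecs = np.linalg.eigh(2.0 * cov)
    whitener = evecs @ np.diag(evals ** -0.5) @ evecs.T
    diffs = means[:, :, None, :] - means[:, None, :, :]   # (draws, N, N, D)
    r = diffs @ whitener                                  # whitener is symmetric
    sq_norms = np.sum(r ** 2, axis=-1)                    # zero on the i == j diagonal
    return sq_norms.sum(axis=(1, 2)) / (n_datasets * (n_datasets - 1))

def pte_from_samples(q_observed, q_null):
    """Fraction of null draws at least as extreme as the observed Q."""
    return float(np.mean(q_null >= q_observed))

def pte_to_nsigma(pte):
    """Two-sided Gaussian-equivalent significance (assumed convention)."""
    return NormalDist().inv_cdf(1.0 - pte / 2.0)

rng = np.random.default_rng(0)
cov = np.array([[1.0, 0.3], [0.3, 0.5]])
q_null = sample_null_q(3, cov, 20000, rng)
print(q_null.mean(), q_null.var())        # both should be close to D = 2
print(pte_from_samples(1.75, q_null))
```

For unequal covariances the same sampling approach still applies, but each pair needs its own whitening matrix and no closed form is assumed.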

Geometric Interpretation and Effective Significance

For three 2D posteriors, all with identical covariances and means at the vertices of an equilateral triangle of side $L$, the observed tension can be mapped to a geometric separation $L$ in parameter space. The authors introduce an effective significance $N_\sigma^{\rm eff}$, defined such that at $L(N_\sigma^{\rm eff}=1)=2.143$ the $68\%$ confidence regions are tangent, with higher values of $N_\sigma^{\rm eff}$ corresponding in the same way to the $95\%$, $99.7\%$, etc. regions.

Figure 2: Various configurations of three posteriors mapped onto a canonical setup of three identical covariances, with means at equilateral triangle vertices.

Figure 3: Dependence of PTE and $N_\sigma$ on the geometric side length $L$ for the reference configuration, clearly delineating effective $N_\sigma^{\rm eff}$ scales in multidimensional space.

For more than three datasets, this construction generalizes to $N$ vertices of a regular polygon, with $N_\sigma^{\rm eff}$ marking the minimum pairwise separation.
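
The quoted value $L = 2.143$ can be reproduced with a short calculation under one possible reading of the construction. The reading is an assumption, chosen because it matches the quoted number, and the paper's own derivation may differ: the posteriors are taken as 2D Gaussians with unit covariance, the $N\sigma$ credible region is a circle of radius $R$ with $1 - e^{-R^2/2}$ equal to the 1D $N\sigma$ probability, tangency puts the centers $2R$ apart, and the separation is expressed in the whitened units of the tension vectors (a factor $1/\sqrt{2}$ from $\hat{C}_i + \hat{C}_j = 2\hat{C}$). The function names are hypothetical.

```python
import math

def credible_radius_2d(n_sigma):
    """Radius of the circle enclosing the 1D n-sigma probability for a unit 2D Gaussian."""
    prob = math.erf(n_sigma / math.sqrt(2.0))      # e.g. 0.6827 for n_sigma = 1
    return math.sqrt(-2.0 * math.log(1.0 - prob))  # from P(r < R) = 1 - exp(-R^2 / 2)

def tangent_side_length(n_sigma):
    """Center separation at tangency, in units whitened by sqrt(C_i + C_j) = sqrt(2)."""
    return 2.0 * credible_radius_2d(n_sigma) / math.sqrt(2.0)

print(round(tangent_side_length(1.0), 3))
```

Only the $N_\sigma^{\rm eff} = 1$ case is checked against the text; values for higher levels follow from the same assumed reading and are not confirmed by the source.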

Application to Synthetic and Real Cosmological Datasets

The authors demonstrate their estimator’s practical consequences using both synthetic and real cosmological posterior distributions (Pantheon+SH0ES, Planck 2018 CMB, DESI DR2 BAO, cosmic chronometers).

Figure 4: Posterior distributions from real cosmological datasets, approximated as Gaussians in the shared parameter space.

A salient result is a systematic reduction in the tension significance when using the geometric $N_\sigma^{\rm eff}$ in higher-dimensional spaces, relative to conventional $N_\sigma$ values based on 1D analogies. For instance:

  • The naive tension between Planck and PPS (Pantheon+SH0ES) is $5.68\sigma$, while the geometric method yields only $3.86\sigma^{\rm eff}$.
  • The DESI-Planck tension increase is modest: $1.18\sigma^{\rm eff}$ (DR1) $\to$ $1.45\sigma^{\rm eff}$ (DR2), in contrast to $1.9\sigma \to 2.3\sigma$ under standard reporting.

This demonstrates that conventional methods overstate the significance of multidimensional dataset inconsistencies. The method also quantifies tension anisotropy via the eccentricity of $\mathcal{C}$:

$$\mathrm{Ecc} = \sqrt{1 - \frac{\lambda_{\min}}{\lambda_{\max}}}$$

with $\mathrm{Ecc}\simeq 0$ indicating isotropic tension contributions and $\mathrm{Ecc}\simeq 1$ indicating dominance by a single direction, information that scalar metrics cannot provide.

Figure 5: Tension vectors for real datasets, annotated with eigenvalues and eccentricity, elucidating anisotropic contributions to total disagreement.
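
The eccentricity follows directly from an eigendecomposition of the dispersion tensor. The sketch below is illustrative only: `tension_eccentricity` is a hypothetical name, the input matrices are made-up numbers rather than values from the paper, and returning the eigenvector of $\lambda_{\max}$ as the "dominant tension direction" is an assumed convention.

```python
import numpy as np

def tension_eccentricity(dispersion):
    """Return (Ecc, eigenvalues ascending, eigenvector of the largest eigenvalue)."""
    evals, evecs = np.linalg.eigh(dispersion)   # eigh sorts eigenvalues ascending
    lam_min, lam_max = evals[0], evals[-1]
    if lam_max <= 0.0:
        return 0.0, evals, evecs[:, -1]         # no tension at all: treat as isotropic
    # Clip tiny negative round-off before taking the square root.
    ecc = float(np.sqrt(max(0.0, 1.0 - lam_min / lam_max)))
    return ecc, evals, evecs[:, -1]

# Isotropic case: equal eigenvalues, so Ecc = 0.
ecc_iso, _, _ = tension_eccentricity(0.5 * np.eye(2))

# Strongly anisotropic case: one direction carries almost all of the tension.
ecc_aniso, evals, direction = tension_eccentricity(np.diag([4.0, 0.04]))
print(ecc_iso, ecc_aniso, direction)
```

The sign of the returned direction is arbitrary, since eigenvectors are defined only up to a factor of $-1$.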

Theoretical and Practical Implications

The methodology formalizes a tension estimator agnostic to the number of datasets and dimensionality of parameter space. It clarifies that conventional 1D analogies can yield misleading overestimations of tension in high-dimensional settings; the geometric interpretation enables direct mapping between significance levels and multidimensional overlaps.

Practically, the approach enables robust, scalable tension assessments as the number of cosmological probes proliferates (e.g., DESI, Euclid, LSST). Independent parameter-inference efforts that combine several probes can use these metrics to detect model inadequacy, systematic error, or new physics that appears only in joint inference spaces. Additionally, the framework can be carried over to other domains (e.g., particle physics, astrophysics) where multi-experiment or multi-survey inconsistencies must be jointly quantified, such as the discrepant W boson mass measurements [CDF Collaboration, 2022Sci...376..170C].

Conclusion

This work introduces a statistically rigorous, geometrically interpretable global tension estimator for simultaneous comparison of multiple high-dimensional posteriors. By exposing the limitations of standard 1D tools and providing effective, multidimensional significance scales, it lays the foundation for quantitative consistency checks critical to modern cosmology. The approach's generality suggests adoption across scientific domains where multi-experiment inference is central. Further research may extend the formalism to non-Gaussian posteriors and explore implications for experimental design and model selection in data-intensive regimes.
