GWcosmo Pipeline for Cosmology Inference
- The GWcosmo pipeline is a computational framework that integrates gravitational-wave data with galaxy and galaxy-cluster catalogs to infer cosmological parameters such as the Hubble constant $H_0$.
- It employs a Bayesian hierarchical model and advanced completeness corrections to build accurate line-of-sight redshift priors from EM surveys, enhancing cosmological inferences.
- The pipeline is modular and extensible, facilitating robust cross-correlation with cosmological backgrounds and future integration of diverse survey data.
The GWcosmo pipeline is a set of methods and computational tools for inferring cosmological parameters, especially the Hubble constant $H_0$, by combining gravitational-wave (GW) observations with electromagnetic (EM) galaxy or galaxy-cluster catalogs. Originating with works by Gray et al., its scope includes both traditional “dark siren” cosmology (where host redshifts are estimated statistically) and more recent extensions incorporating galaxy cluster information, robust catalogue completeness estimates, and cross-correlation with cosmological backgrounds. The pipeline is architected to handle the selection effects, completeness limits, and likelihood formulations that arise in GW cosmology with incomplete or multi-resolution EM datasets, achieving competitive constraints on $H_0$ through hierarchical Bayesian inference and advanced catalogue handling (Datrier et al., 20 Feb 2025, Beirnaert et al., 20 May 2025, Schulze et al., 2023).
1. Core Bayesian Framework and Likelihood Construction
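GWcosmo adopts a Bayesian hierarchical model targeting inference on cosmological parameters ($H_0$, with extensions to additional cosmological parameters and modified gravity). For a set of $N$ GW events with strain data $\{x_i\}$ and source redshifts $\{z_i\}$, the central posterior factorizes as:

$$p(H_0, \{z_i\} \mid \{x_i\}) \propto \pi(H_0) \prod_{i=1}^{N} \frac{p(x_i \mid d_L(z_i, H_0))\, p_{\mathrm{LOS}}(z_i)}{\beta(H_0)}$$

- $\pi(H_0)$: user prior over $H_0$
- $p(x_i \mid d_L(z_i, H_0))$: single-event GW likelihood, recast from distances to redshifts via the luminosity distance $d_L(z, H_0)$
- $p_{\mathrm{LOS}}(z_i)$: line-of-sight redshift prior from EM catalogue(s)
- $\beta(H_0)$: selection-function normalization, accounting for cosmology-dependent GW detectability

Marginalizing over the individual source redshifts leads to

$$p(H_0 \mid \{x_i\}) \propto \pi(H_0) \prod_{i=1}^{N} \frac{\int p(x_i \mid d_L(z, H_0))\, p_{\mathrm{LOS}}(z)\, \mathrm{d}z}{\beta(H_0)}$$

This formalism is adaptable: it handles cluster catalogs, mixed catalogs, or hypothetical future catalogs by adjusting $p_{\mathrm{LOS}}(z)$ and the completeness model (Beirnaert et al., 20 May 2025).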
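The marginalized posterior over $H_0$ described above can be illustrated with a small grid-based sketch. This is not the gwcosmo API: the function names, the Hubble-law distance (valid only at low redshift), the Gaussian distance likelihood, and the logistic detection probability are all simplifying assumptions chosen to keep the example short.

```python
import numpy as np

C_KM_S = 299792.458  # speed of light in km/s


def lum_dist(z, h0):
    """Hubble-law luminosity distance in Mpc (toy assumption, z << 1 only)."""
    return C_KM_S * z / h0


def p_det(d_l, d_hor=400.0, width=30.0):
    """Toy detection probability: smooth cutoff around a horizon distance (Mpc)."""
    return 1.0 / (1.0 + np.exp((d_l - d_hor) / width))


def log_posterior_h0(h0_grid, z_grid, p_los, events):
    """Log posterior on an H0 grid for a flat prior, up to an additive constant.

    z_grid must be uniform; p_los is the line-of-sight redshift prior on z_grid;
    events is a list of (d_obs, sigma) Gaussian distance measurements in Mpc.
    """
    dz = z_grid[1] - z_grid[0]
    logp = np.zeros(len(h0_grid))
    for k, h0 in enumerate(h0_grid):
        d_l = lum_dist(z_grid, h0)
        # Selection normalization beta(H0): detectable fraction of the prior.
        beta = np.sum(p_det(d_l) * p_los) * dz
        for d_obs, sigma in events:
            like = np.exp(-0.5 * ((d_obs - d_l) / sigma) ** 2)
            # Marginalize the single-event likelihood over redshift.
            logp[k] += np.log(np.sum(like * p_los) * dz) - np.log(beta)
    return logp - logp.max()


def demo_best_h0():
    """Three noise-free events hosted by three catalogued redshifts, H0 = 70."""
    z_grid = np.linspace(0.001, 0.2, 2000)
    hosts = np.array([0.02, 0.03, 0.05])
    p_los = np.exp(-0.5 * ((z_grid[:, None] - hosts) / 0.001) ** 2).sum(axis=1)
    p_los /= p_los.sum() * (z_grid[1] - z_grid[0])
    events = [(lum_dist(z, 70.0), 10.0) for z in hosts]
    h0_grid = np.arange(40.0, 121.0, 1.0)
    logp = log_posterior_h0(h0_grid, z_grid, p_los, events)
    return h0_grid[np.argmax(logp)]
```

Each event alone is compatible with several $H_0$ values (one per candidate host), and only the product over events singles out the value at which all events line up with catalogued redshifts. That is the statistical host association the dark-siren method relies on.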
2. EM Catalog Integration: Galaxy and Cluster Priors
Initially, GWcosmo utilized full-sky galaxy catalogs (e.g., GLADE, GLADE+) to supply the line-of-sight redshift prior by associating candidate galaxies (with photometric or spectroscopic redshifts $z$) to GW localization regions and marginalizing over possible hosts. With the introduction of robust completeness limits, the pipeline discards galaxies beyond an apparent-magnitude threshold $m_{\mathrm{thr}}$, determined per sky pixel, when assigning host probabilities (Datrier et al., 20 Feb 2025).
Recent advancements adapt the pipeline to leverage galaxy cluster catalogs such as PSZ2 (Planck) and eRASS (eROSITA). Here, a “ClusterCatalogue” class constructs the LOS-$z$ prior as a sum over clusters in each GW localization pixel, weighting by cluster mass and convolving with $z$-uncertainties (typically Gaussian, with width set by the physical cluster size). For sky areas or redshift ranges where the catalog is incomplete, the missing prior is supplied from a Press–Schechter mass function, normalized by measured completeness curves. The full prior is:

$$p_{\mathrm{LOS}}(z) \propto p_{\mathrm{cat}}(z) + p_{\mathrm{miss}}(z)$$

with the in-catalogue term $p_{\mathrm{cat}}$ and the out-of-catalogue term $p_{\mathrm{miss}}$ built as in (Beirnaert et al., 20 May 2025).
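As a rough illustration of how a cluster-based line-of-sight prior of this kind can be assembled, the sketch below combines a mass-weighted sum of Gaussians with a completeness-weighted background term. The function name, the mixing by a captured fraction `f_in`, and the caller-supplied `background` curve (a stand-in for the Press–Schechter prediction) are assumptions for illustration, not the construction used in the cited work.

```python
import numpy as np


def cluster_los_prior(z_grid, z_cl, sigma_z, mass, completeness, background):
    """Toy LOS redshift prior on a uniform z grid.

    z_cl, sigma_z, mass: per-cluster redshift, redshift scatter, and mass.
    completeness: catalogue completeness C(z) in [0, 1] on z_grid.
    background: unnormalized redshift distribution of all potential hosts
                (hypothetical stand-in for a mass-function prediction).
    """
    dz = z_grid[1] - z_grid[0]
    # In-catalogue term: one Gaussian per cluster, weighted by cluster mass.
    weights = mass / mass.sum()
    gauss = np.exp(-0.5 * ((z_grid[:, None] - z_cl) / sigma_z) ** 2)
    gauss /= np.sqrt(2.0 * np.pi) * sigma_z
    p_cat = gauss @ weights
    p_cat /= p_cat.sum() * dz
    # Out-of-catalogue term: the part of the background the catalogue misses.
    p_miss = (1.0 - completeness) * background
    if p_miss.sum() > 0:
        p_miss = p_miss / (p_miss.sum() * dz)
    # Fraction of the background population captured by the catalogue.
    f_in = np.sum(completeness * background) / np.sum(background)
    return f_in * p_cat + (1.0 - f_in) * p_miss


def demo_prior():
    """Two clusters, an exponentially falling completeness, z^2 background."""
    z_grid = np.linspace(0.0, 1.0, 1001)
    prior = cluster_los_prior(
        z_grid,
        z_cl=np.array([0.1, 0.3]),
        sigma_z=np.array([0.01, 0.01]),
        mass=np.array([1e14, 3e14]),
        completeness=np.exp(-z_grid / 0.3),
        background=z_grid ** 2,
    )
    return z_grid, prior
```

Both terms are normalized separately before mixing, so the result integrates to one on the grid whatever the completeness curve looks like.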
3. Statistical Treatment of Catalogue Completeness
GWcosmo initially used a “median-magnitude” heuristic to estimate catalogue completeness, often discarding a significant fraction of faint galaxies even when well detected. The robust method introduced in (Datrier et al., 20 Feb 2025) operationalizes the statistical test of Rauzy (2001):
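- For each galaxy $i$, the rank-ratio statistic $\hat{\zeta}_i = r_i/(n_i + 1)$ is constructed, where $r_i$ (resp. $n_i$) is the count of galaxies with lower distance modulus and $M \le M_i$ (resp. $M \le M_{\mathrm{lim}}^{i}$, the faintest absolute magnitude observable at the distance of galaxy $i$ for a trial limit $m_*$).
- Under the null hypothesis (catalogue complete to $m_*$), the $\zeta_i$ are i.i.d. uniform on $[0, 1]$.
- The global statistic $T_C$ is tracked as a function of $m_*$; the threshold $m_{\mathrm{thr}}$ is where $T_C$ first drops below a chosen negative critical value.
- No galaxy luminosity function is assumed; only spatial stationarity within band/redshift is needed.
- Implementation subsamples pixels for speed (e.g., a fixed number of galaxies per B-band HEALPix pixel, averaged over 30 runs), incorporates photometric redshift uncertainties, and produces a per-pixel $m_{\mathrm{thr}}$ map.
The pipeline then integrates this limit into the likelihood, discarding galaxies with $m > m_{\mathrm{thr}}$ and modifying both the discrete sum and the missing-mass (out-of-catalog) terms.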
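The completeness test can be sketched in a few lines. The code below is a minimal version of the $T_C$ statistic in the form usually attributed to Rauzy (2001), assuming the distance moduli are known exactly and ignoring the photometric-redshift uncertainties and per-pixel subsampling mentioned above; `rauzy_tc` and `mock_catalogue` are hypothetical names, not gwcosmo functions.

```python
import numpy as np


def rauzy_tc(m_app, mu, m_lim):
    """T_C statistic for a trial apparent-magnitude limit m_lim.

    m_app: apparent magnitudes; mu: distance moduli (assumed known here).
    Under completeness to m_lim, T_C has zero mean and unit variance.
    """
    keep = m_app <= m_lim
    m, mu = m_app[keep], mu[keep]
    abs_mag = m - mu          # absolute magnitude of each galaxy
    abs_lim = m_lim - mu      # faintest absolute magnitude visible at its distance
    zeta_sum, var_sum = 0.0, 0.0
    for i in range(abs_mag.size):
        closer = mu <= mu[i]
        r = np.count_nonzero(closer & (abs_mag <= abs_mag[i]))
        n = np.count_nonzero(closer & (abs_mag <= abs_lim[i]))
        zeta_sum += r / (n + 1) - 0.5            # rank ratio minus its mean
        var_sum += (n - 1) / (12.0 * (n + 1))    # variance of the rank ratio
    return zeta_sum / np.sqrt(var_sum)


def mock_catalogue(n=4000, m_true=18.0, seed=1):
    """Mock sample with a sharp true magnitude limit (toy distributions)."""
    rng = np.random.default_rng(seed)
    abs_mag = rng.normal(-20.0, 1.0, n)
    mu = rng.uniform(32.0, 42.0, n)
    m_app = abs_mag + mu
    keep = m_app <= m_true
    return m_app[keep], mu[keep]
```

Scanning `m_lim` from bright to faint and recording where `rauzy_tc` falls systematically below the chosen critical value gives the threshold for one pixel. The pairwise loop is quadratic in the number of galaxies, which is one reason a production implementation would subsample.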
4. Pipeline Modularity and Workflow
The “gwcosmo” pipeline is structured for extensibility:
- Inputs: GW parameter-estimation posteriors (luminosity distance, sky position, mass, spins) and EM catalogs (galaxy or cluster, with magnitude, redshift $z$, and mass).
- Pixelation: Catalogs are subdivided into HEALPix pixels at a fixed $N_{\mathrm{side}}$ resolution.
- Completeness: A per-pixel magnitude-threshold ($m_{\mathrm{thr}}$) map is computed and cached.
- Likelihood Evaluation: For each GW event, the pipeline reads the $m_{\mathrm{thr}}$ map, constructs the LOS-$z$ prior, and evaluates the GW likelihood, including completeness corrections. For clusters, additional Press–Schechter terms supplement out-of-catalog regions.
- Inference: The core sampling engine is unchanged, except for reading in updated priors and completeness limits; base posterior sampling over $H_0$ and ancillary parameters proceeds as standard.
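The completeness step of this workflow reduces to a lookup once the per-pixel map is cached. The sketch below assumes galaxy pixel indices have already been computed by a HEALPix library and that pixels without a usable limit are stored as NaN; the function names are illustrative and not part of the gwcosmo code base.

```python
import numpy as np


def healpix_npix(nside):
    """Number of pixels in a HEALPix map of resolution nside."""
    return 12 * nside ** 2


def apply_pixel_threshold(pix, m_app, m_thr_map):
    """Boolean mask of galaxies at or brighter than their pixel's magnitude limit.

    pix: pixel index of each galaxy; m_app: apparent magnitudes;
    m_thr_map: cached per-pixel threshold (NaN where no limit exists).
    Galaxies in NaN pixels are dropped because comparisons with NaN are False.
    """
    pix = np.asarray(pix)
    m_app = np.asarray(m_app, dtype=float)
    return m_app <= m_thr_map[pix]


def demo_mask():
    m_thr_map = np.full(healpix_npix(2), 18.0)
    m_thr_map[5] = 16.0      # shallower pixel
    m_thr_map[7] = np.nan    # pixel with no usable limit
    pix = np.array([0, 5, 5, 7])
    m_app = np.array([17.5, 15.0, 17.0, 12.0])
    return apply_pixel_threshold(pix, m_app, m_thr_map)
```

Keeping the threshold as a plain array indexed by pixel means the same cached map can be reused for every GW event without recomputing the completeness test.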
Integration with external frameworks—such as using GW_CLASS and MontePython for anisotropy studies or joint GW+CMB cosmology—is handled via standard data flows and output chaining (Schulze et al., 2023).
5. Quantitative Performance and Improvements
Application of these methods has resulted in quantifiable gains in cosmological parameter inference:
- For GLADE+ (B-band), robust completeness cuts from the Rauzy test resulted in a narrower $H_0$ credible interval for dark sirens alone, and also when combined with GW170817, compared to the previous median-cut approach (Datrier et al., 20 Feb 2025).
- Using cluster catalogs (PSZ2 and eRASS) rather than galaxies, the interval width on $H_0$ narrowed relative to GLADE+ for both PSZ2 and eRASS, attributable to the deeper redshift reach of modern cluster catalogs (Beirnaert et al., 20 May 2025).
- For GWTC-3 BBH events in K-band (GLADE+), neither the median nor the robust method yielded an improvement, because the catalog is incomplete at the relevant distances: the posteriors reverted to the empty-catalogue limit (Datrier et al., 20 Feb 2025).
The table below summarizes relevant uncertainties (rounded; see original works for details):
| Catalog/Band | Method | Posterior Width (km/s/Mpc) | Gain vs. Median |
|---|---|---|---|
| GLADE+, | Median | -- | |
| GLADE+, | Robust | | |
| PSZ2 | Cluster | | |
| eRASS | Cluster | | |
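For readers who want to compute comparable quantities from their own gridded posteriors, the sketch below measures an equal-tailed credible-interval width and a fractional gain between two analyses. The 68% level and the definition of gain as one minus the ratio of widths are assumptions made for illustration; the cited works may use different conventions.

```python
import numpy as np


def credible_width(h0_grid, posterior, level=0.68):
    """Width of the equal-tailed credible interval of a gridded 1D posterior."""
    pdf = np.asarray(posterior, dtype=float)
    cdf = np.cumsum(pdf)
    cdf /= cdf[-1]
    lower = np.interp(0.5 * (1.0 - level), cdf, h0_grid)
    upper = np.interp(0.5 * (1.0 + level), cdf, h0_grid)
    return upper - lower


def fractional_gain(width_ref, width_new):
    """Fractional narrowing of the interval relative to a reference analysis."""
    return 1.0 - width_new / width_ref


def demo_gain():
    """Two Gaussian posteriors centred on 70 with widths 5 and 4 km/s/Mpc."""
    h0_grid = np.arange(20.0, 140.0, 0.1)
    wide = np.exp(-0.5 * ((h0_grid - 70.0) / 5.0) ** 2)
    narrow = np.exp(-0.5 * ((h0_grid - 70.0) / 4.0) ** 2)
    w_ref = credible_width(h0_grid, wide)
    w_new = credible_width(h0_grid, narrow)
    return w_ref, w_new, fractional_gain(w_ref, w_new)
```

The posterior does not need to be normalized beforehand, since the cumulative sum is rescaled to end at one.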
6. Selection Effects, Systematics, and Future Prospects
Selection function normalization is included in all likelihoods, reflecting the joint sensitivity to changes in $H_0$ and event detection probability. GWcosmo pipelines also adjust for the partial sky coverage of each cluster catalog (PSZ2, eRASS), redshift uncertainty models (e.g., Gaussian for clusters, with scatter from physical size or X-ray scaling), and EM survey incompleteness via analytic or empirical completeness curves.
Planned extensions include:
- Combining both galaxy and cluster catalogues such that well-mapped low-$z$ galaxies and high-$z$ clusters provide joint support.
- Adding a term to the LOS-$z$ prior accounting for mergers outside clusters (voids/filaments), as neither catalog necessarily captures all hosts.
- Generalizing completeness methodology to accommodate both faint and bright magnitude limits (e.g., for DESI, LSST).
- Adapting to multi-parameter cosmology inference (additional cosmological parameters, modified gravity), achievable by replacing $H_0$-only grids with multidimensional cosmology grids.
Robustness checks—varying input mass-function parameters, redshift scatter, completeness curves—have consistently shown negligible impact on posteriors.
A complete end-to-end mock data challenge, with known injected cosmological parameters, is recommended before application to the next-generation observing runs.
7. Integration with Cosmological GW Background Studies
The pipeline can be embedded in broader frameworks, connecting GW event-based inference with studies of cosmological GW backgrounds (CGWB) and their cross-correlation with CMB anisotropies (Schulze et al., 2023). The mathematical structure allows for inclusion of CGWB angular power spectra $C_\ell^{\mathrm{CGWB}}$, cross-spectra with CMB temperature $C_\ell^{\mathrm{CGWB} \times \mathrm{CMB}}$, and computation of projected constraints with joint CMB + GW datasets. A modified version of the CLASS Boltzmann solver (GW_CLASS) computes these observables, and interfacing with MontePython enables joint posterior sampling over both conventional and GW-specific cosmological parameters.
The GWcosmo pipeline thus constitutes a central computational framework in the emerging field of gravitational-wave cosmology, capable of adapting to new catalogs, survey depths, and parameter inference challenges as the capabilities of both GW detectors and EM surveys expand.