Optimal Cross-Correlation Analysis: OCCAM
- The paper presents OCCAM as a noise-weighted cross-correlation technique that rapidly distinguishes lensed and unlensed gravitational wave signals.
- It leverages matched-filter templates and detector noise PSDs to compute a normalized statistic for low-latency candidate pre-selection.
- OCCAM outperforms traditional methods with superior detection metrics (AUC ≳ 0.98) and scalable multi-detector performance.
Optimal Cross-Correlation Analysis for Multiplets (OCCAM) is a noise-weighted, optimal cross-correlation technique for the rapid identification of strongly lensed gravitational wave (GW) event pairs, specifically those resulting from compact binary coalescences (CBCs). By leveraging matched-filter templates and measured detector noise power spectral densities (PSDs), OCCAM offers a computationally efficient and only mildly model-dependent way to discriminate lensed from unlensed event pairs. This enables low-latency pre-selection in large GW datasets, so that higher-latency, more computationally intensive algorithms can then be applied downstream (Kopty et al., 29 Jan 2026).
1. Motivation and Context in Gravitational Wave Lensing Searches
The detection of strongly lensed GW pairs is of significant astrophysical interest, because strong lensing produces multiple GW "images" that share the same waveform shape but differ in amplitude, arrival time, and a discrete phase shift (the Morse phase). The increasing sensitivity of GW observatories such as Advanced LIGO and Virgo makes identifying such events increasingly pressing. The principal challenge is to isolate candidate lensed pairs efficiently from a vast trigger set: existing approaches either demand extensive computational resources or have limited efficacy, especially for low-to-moderate signal-to-noise ratio (SNR) events.
Contemporary techniques are broadly separated into two categories:
- Bayesian joint parameter estimation: e.g., Janquart et al. (2021), Lo & Magaña Hernandez (2023), which reliably identify lensing events but are computationally prohibitive and effective mainly at high SNR.
- Lightweight heuristics: e.g., GLANCE (unweighted cross-correlation), phase-consistency, chi-square comparisons, machine learning, and posterior-overlap statistics. These often neglect the frequency structure of detector noise or impose restrictive model assumptions, yielding higher false-alarm rates and sub-optimal sensitivity.
OCCAM addresses these limitations by introducing a noise-weighted cross-correlation statistic with only mild model dependence, designed specifically to maximize discrimination between lensed and unlensed GW CBC events at minimal computational cost.
2. Mathematical Formalism and Core Statistic
OCCAM's construction begins with two detector data streams, $s_1(t) = h_1(t) + n_1(t)$ and $s_2(t) = h_2(t) + n_2(t)$, where $h_1, h_2$ are GW signals and $n_1, n_2$ denote zero-mean, Gaussian detector noise processes (uncorrelated between detectors).
Central to OCCAM is a frequency-dependent modified PSD, built from the one-sided noise PSD of each data stream and the finite-duration Fourier transform of the template corresponding to each event.
A time-dependent inner product is then defined, from which follow the noise-weighted cross-correlation SNR estimator and its expected ("optimal") SNR.
For strong lensing, where the second signal is a copy of the first rescaled according to the relative magnification and shifted by the Morse phase difference, the cross-correlation statistic simplifies.
A normalized form of the statistic yields bounded values for effective pairwise discrimination.
Multi-detector networks employ a simple sum over all detector pairs.
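For orientation, the strong-lensing relation invoked above can be written in the standard geometric-optics form. This is a sketch in assumed notation ($\tilde h_1$, $\tilde h_2$ for the Fourier-domain images, $\mu_{\rm rel}$ for the relative magnification, $\Delta t$ for the time delay, $\Delta n$ for the Morse index difference), and the signs of the phases depend on the Fourier convention; it is not necessarily the paper's exact expression.

```latex
% Geometric-optics relation between two lensed images of one source, for f > 0.
% Assumed convention: \tilde h(f) = \int h(t)\, e^{-2\pi i f t}\, dt
\tilde h_2(f) = \sqrt{\mu_{\rm rel}}\; e^{-2\pi i f \Delta t}\; e^{-i\pi \Delta n}\; \tilde h_1(f),
\qquad
\mu_{\rm rel} = \left|\frac{\mu_2}{\mu_1}\right|,
\qquad
\Delta n = n_2 - n_1 \in \left\{0,\ \pm\tfrac{1}{2},\ \pm 1\right\}
```

Once the two data slices are time-aligned, the $\Delta t$ factor drops out, so the two images differ only by a real amplitude factor and a constant phase. That is why a cross-correlation of the two data streams is a natural discriminator for lensed pairs.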
3. Stepwise Procedure and Implementation Workflow
The OCCAM algorithm is designed as a post-processing module following standard pipeline triggers:
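- Candidate selection: From matched-filter (MF) triggers, select event pairs whose individual SNRs exceed the selection threshold (or, for a network, whose network SNR exceeds the threshold, with a minimum SNR required in each detector).
- Relative magnification estimation: Compute the relative magnification for each candidate pair.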
- Data slicing: For each event, extract a strain slice of length , where
- Alignment: Time-align slices so that coalescence times coincide.
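- Frequency-domain transformation: Apply an FFT to each aligned slice to obtain the frequency-domain data for the two events.
- Template construction: Use the MF best-fit template of the louder event to compute the finite-duration template Fourier transform.
- Numerator calculation: Evaluate the noise-weighted inner product of the two frequency-domain data slices.
- Denominator and statistic calculation: Compute the denominator, form the cross-correlation SNR estimator, then normalize to obtain the normalized statistic.
- Thresholding: Flag pairs whose normalized statistic exceeds a threshold (set by analysis of backgrounds) as lensing candidates.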
A schematic summary of the OCCAM workflow is given below:
| Step | Data | Operation |
|---|---|---|
| 1 | Matched-filter GW triggers | Event selection/filtering |
| 2 | Trigger/meta info | Magnification estimation |
| 3 | Strain time series | Data slicing/alignment |
| 4 | Sliced strain | FFT to frequency domain |
| 5 | Best-fit MF template | Template construction |
| 6-8 | Sliced/FFT data + template + PSDs | Cross-correlation computation |
| 9 | Normalized statistic | Candidate flagging |
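The core of the workflow (FFT, noise weighting, cross-correlation, normalization) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the function name `normalized_cross_correlation`, the weight $1/\sqrt{S_1 S_2}$ (a simple stand-in for OCCAM's modified PSD), the flat toy PSDs, and the toy chirp are all hypothetical. Taking the modulus of the complex sum makes the result insensitive to a constant phase offset between the two signals, and using the same weight in numerator and denominator bounds the result between 0 and 1.

```python
import numpy as np


def normalized_cross_correlation(s1, s2, psd1, psd2, fs, f_low=20.0):
    """Normalized, noise-weighted cross-correlation of two time-aligned slices.

    ASSUMPTION: the weight 1/sqrt(psd1 * psd2) is an illustrative stand-in,
    not OCCAM's modified PSD. psd1 and psd2 are sampled on the rfft grid.
    """
    s1 = np.asarray(s1, dtype=float)
    s2 = np.asarray(s2, dtype=float)
    if s1.shape != s2.shape:
        raise ValueError("slices must have the same length")
    freqs = np.fft.rfftfreq(s1.size, d=1.0 / fs)
    s1_f = np.fft.rfft(s1)
    s2_f = np.fft.rfft(s2)
    # Keep frequencies from the low-frequency cutoff up to (not including) Nyquist.
    band = (freqs >= f_low) & (freqs < fs / 2.0)
    weight = np.zeros_like(freqs)
    weight[band] = 1.0 / np.sqrt(psd1[band] * psd2[band])
    cross = np.sum(np.conj(s1_f) * s2_f * weight)
    norm = np.sqrt(np.sum(np.abs(s1_f) ** 2 * weight)
                   * np.sum(np.abs(s2_f) ** 2 * weight))
    return float(np.abs(cross) / norm) if norm > 0 else 0.0


# Toy demonstration (all values hypothetical).
fs = 1024.0
t = np.arange(0.0, 4.0, 1.0 / fs)
h1 = np.sin(2 * np.pi * (30.0 * t + 5.0 * t ** 2)) * np.hanning(t.size)

# "Lensed" copy: half the amplitude and a constant -pi/2 phase shift.
h1_f = np.fft.rfft(h1)
h2 = np.fft.irfft(0.5 * np.exp(-1j * np.pi / 2) * h1_f, n=t.size)

flat_psd = np.ones(h1_f.size)
rng = np.random.default_rng(0)
noise = rng.standard_normal(t.size)

stat_lensed = normalized_cross_correlation(h1, h2, flat_psd, flat_psd, fs)
stat_noise = normalized_cross_correlation(h1, noise, flat_psd, flat_psd, fs)
print(stat_lensed)  # close to 1 for the noiseless lensed copy
print(stat_noise)   # much smaller for an unrelated noise stream
```

In a real analysis the two inputs would be noisy strain slices and the PSDs would be measured, so the statistic for a lensed pair would fall below 1 by an amount that depends on the SNR.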
4. Computational Complexity and Scaling
The bulk of computational effort in OCCAM arises from performing two FFTs per event pair (over slices of $N$ samples, where $N$ is the slice length times the sampling rate) and evaluating the frequency-domain inner product, resulting in $O(N \log N)$ complexity per pair. For $N_{\rm ev}$ candidate events in each selection "slot," indiscriminate pairing would scale as $O(N_{\rm ev}^2)$, but astrophysically motivated time-delay windows reduce the number of meaningful comparisons.
Application to multi-detector networks introduces an $O(N_{\rm det}^2)$ scaling in the number of detectors $N_{\rm det}$ due to pairwise statistics, but per-pair frequency integrals are efficiently reused. The absence of Bayesian sampling or full parameter estimation ensures latency of seconds per pair on a single CPU.
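The effect of a time-delay window on the number of comparisons can be illustrated with a short Python sketch. The function name `candidate_pairs` and the window value are hypothetical; the source only states that astrophysically motivated windows are used, not how they are implemented.

```python
from bisect import bisect_right


def candidate_pairs(trigger_times, max_delay):
    """Return index pairs (i, j) with 0 <= t_j - t_i <= max_delay.

    Indices refer to the input list; i is the earlier trigger of each pair.
    Sorting once and bisecting avoids testing all N*(N-1)/2 pairs.
    """
    order = sorted(range(len(trigger_times)), key=lambda k: trigger_times[k])
    times = [trigger_times[k] for k in order]
    pairs = []
    for a, t_a in enumerate(times):
        # First sorted position whose time lies beyond the allowed window.
        hi = bisect_right(times, t_a + max_delay)
        for b in range(a + 1, hi):
            pairs.append((order[a], order[b]))
    return pairs


# Four triggers (seconds) and a hypothetical 20 s window: only 2 of the
# 6 possible pairs survive.
print(candidate_pairs([0.0, 10.0, 100.0, 105.0], 20.0))  # [(0, 1), (2, 3)]
```

The cost is one sort plus work proportional to the number of surviving pairs, so a tight window keeps the pair count well below the quadratic worst case.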
5. Detection Performance and Comparative Metrics
OCCAM demonstrates significantly improved detection metrics relative to earlier rapid algorithms. For Advanced LIGO design PSD, 20 Hz low-frequency cutoff, and (single detector):
- True positive rate: 97% detection at a false positive probability (FPP) of approximately 13%; 80% detection at FPP ≈ 7%.
- ROC/AUC: Area under ROC (AUC) ≳ 0.98, outperforming alternatives such as the chi-square approach (Gholap et al. 2025) or GLANCE.
- Network performance: For HLV network with , per-detector , network ROC is significantly enhanced, allowing FPP ≲ 1% at >90% detection.
Direct comparisons indicate that OCCAM outperforms all other inexpensive methods considered, including unweighted cross-correlation (GLANCE), phase-consistency tests, chi-square classifiers, posterior-overlap statistics, and SLICK (machine learning), in both ROC and AUC.
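For readers unfamiliar with the metrics quoted above, the following Python sketch shows how an ROC curve and its AUC can be computed from samples of a detection statistic for lensed and unlensed pairs. The function names and the toy numbers are hypothetical and are not results from the paper.

```python
import numpy as np


def roc_curve(lensed_stats, unlensed_stats):
    """Return (fpp, tpr) arrays, sweeping the threshold from high to low."""
    lensed = np.asarray(lensed_stats, dtype=float)
    unlensed = np.asarray(unlensed_stats, dtype=float)
    # np.unique returns sorted values; reverse for a descending threshold sweep.
    thresholds = np.unique(np.concatenate([lensed, unlensed]))[::-1]
    tpr = np.array([(lensed >= th).mean() for th in thresholds])
    fpp = np.array([(unlensed >= th).mean() for th in thresholds])
    # Start the curve at the origin (threshold above every sample).
    return np.concatenate([[0.0], fpp]), np.concatenate([[0.0], tpr])


def area_under_curve(fpp, tpr):
    """Trapezoidal area under the ROC curve."""
    return float(np.sum(np.diff(fpp) * (tpr[1:] + tpr[:-1]) / 2.0))


# Toy samples (hypothetical): perfectly separated populations.
fpp, tpr = roc_curve([0.90, 0.80, 0.95], [0.10, 0.20, 0.30])
print(area_under_curve(fpp, tpr))  # 1.0 for perfect separation
```

An AUC of 0.5 corresponds to a statistic with no discriminating power, while 1.0 means every lensed pair scores above every unlensed pair.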
6. Astrophysical Assumptions and Role of Ancillary Information
The evaluation of OCCAM assumes the following context:
- CBC population: Power-law plus peak BBH mass distribution, merger-rate redshift evolution per Oguri (2018).
- Lensing model: Singular Isothermal Ellipsoid galaxies, generating time delays from minutes to months.
- Waveform models: IMRPhenomD, non-spinning; detector noise approximated as Gaussian with design PSDs.
- Sky localization: No localization constraints are imposed in primary OCCAM evaluations. Incorporation of sky-map overlap (e.g., from multi-detector triangulation) would suppress FPP by 1–2 orders of magnitude, as shown in Haris et al. (2018). This suggests that future deployments incorporating such auxiliary information could further enhance OCCAM's specificity.
7. Implications, Limitations, and Prospective Applications
OCCAM's statistically rigorous noise-weighted cross-correlation approach, using only matched-filter triggers and PSD estimates, achieves rapid, low-cost selection of lensed event candidates. This lets downstream Bayesian or machine-learning algorithms, which typically have higher computational cost and latency, focus on a sharply reduced candidate set. OCCAM remains effective even in single-detector scenarios.
With the anticipated growth in detected CBC triggers in upcoming GW observing runs, OCCAM is poised to serve as an essential low-latency pre-selection tool, substantially increasing the feasibility of detecting strongly lensed gravitational waves with current and next-generation interferometers (Kopty et al., 29 Jan 2026).