Optimal Cross-Correlation Analysis: OCCAM

Updated 30 January 2026

The paper presents OCCAM as a noise-weighted cross-correlation technique that rapidly distinguishes lensed and unlensed gravitational wave signals.
It leverages matched-filter templates and detector noise PSDs to compute a normalized statistic for low-latency candidate pre-selection.
In the reported tests, OCCAM outperforms other rapid methods (AUC ≳ 0.98) and extends to multi-detector networks.

Optimal Cross-Correlation Analysis for Multiplets (OCCAM) is a noise-weighted, optimal cross-correlation technique for the rapid identification of strongly lensed gravitational wave (GW) event pairs, specifically those resulting from compact binary coalescences (CBCs). By leveraging matched-filter templates and measured detector noise power spectral densities (PSDs), OCCAM offers a computationally efficient and mildly model-dependent method for discriminating lensed and unlensed event pairs, enabling low-latency pre-selection in large GW datasets and thus facilitating downstream application of higher-latency, more computationally intensive algorithms (Kopty et al., 29 Jan 2026).

1. Motivation and Context in Gravitational Wave Lensing Searches

The detection of strongly lensed GW pairs is of significant astrophysical interest because strong lensing produces multiple GW "images" that share the same waveform shape but differ in amplitude, arrival time, and a discrete phase shift (the Morse factor). The increasing sensitivity of GW observatories such as Advanced LIGO and Virgo makes the identification of such events increasingly pressing. The principal challenge lies in efficiently isolating candidate lensed pairs from a vast trigger set: existing approaches either demand extensive computational resources or lose efficacy, especially for low-to-moderate signal-to-noise ratio (SNR) events.

Contemporary techniques are broadly separated into two categories:

Bayesian joint parameter estimation: e.g., Janquart et al. (2021), Lo & Magaña Hernandez (2023), which reliably identify lensing events but are computationally prohibitive and effective mainly at high SNR.
Lightweight heuristics: e.g., GLANCE (unweighted cross-correlation), phase-consistency, chi-square comparisons, machine learning, and posterior-overlap statistics. These often neglect the frequency structure of detector noise or impose restrictive model assumptions, yielding higher false-alarm rates and sub-optimal sensitivity.

OCCAM addresses these limitations by introducing a noise-weighted cross-correlation statistic with only mild model dependence, designed specifically to maximize discrimination between lensed and unlensed GW CBC events at minimal computational cost.

2. Mathematical Formalism and Core Statistic

OCCAM's construction begins with two detector data streams, $s_1(t) = h_1(t) + n_1(t)$ and $s_2(t) = h_2(t) + n_2(t)$ , where $h_i(t)$ are GW signals and $n_i(t)$ denote zero-mean, Gaussian detector noise processes (uncorrelated between detectors).

Central to OCCAM is a frequency-dependent modified PSD:

$\xi(t;f) = \frac{1}{2} |\tilde h_1|^2 S_{n,2} + \frac{1}{2} |\tilde h_2|^2 S_{n,1} + \frac{T}{4} S_{n,1} S_{n,2}$

where $S_{n,i}$ is the one-sided noise PSD of data stream $i$ , and $\tilde h_i(f)$ is the finite-duration Fourier transform of the template corresponding to event $i$ .

A time-dependent inner product is defined as:

$(A, B)_t = \int_{-\infty}^{\infty} \frac{\tilde A^*(t; f)\, \tilde B(t; f)}{\xi(t; f)} df$

The noise-weighted cross-correlation SNR estimator is then:

$\rho_{\mathrm{CC}} = \frac{(\tilde h_1^* \tilde h_2,\, \tilde s_1^* \tilde s_2)_t}{(\tilde h_1^* \tilde h_2,\, \tilde h_1^* \tilde h_2)_t^{1/2}}$

with expected ("optimal") SNR

$\rho_{\mathrm{opt}} = (\tilde h_1^* \tilde h_2,\, \tilde h_1^* \tilde h_2)_t^{1/2}$

For strong lensing, where $\tilde h_2(f) = \sqrt{|\mu_{\mathrm{rel}}|} \exp{[i m_\phi]} \tilde h_1(f)$ (with relative magnification $\mu_{\mathrm{rel}}$ and Morse phase difference $m_\phi$ ), the cross-correlation statistic simplifies to:

$\rho_{\mathrm{CC}}^{\max} = \frac{(|\tilde h_1|^2,\, \tilde s_1^* \tilde s_2)_t}{(|\tilde h_1|^2,\, |\tilde h_1|^2)_t^{1/2}}$

A normalized form is given by $\hat\rho_{\mathrm{CC}} = \rho_{\mathrm{CC}}^{\max}/\rho_{\mathrm{opt}}$ , yielding values in $[0,1]$ for effective pairwise discrimination.
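
The definitions above can be sketched numerically as follows. This is a minimal illustration, not the authors' implementation: the function names, the discretisation of the integral as a sum over whatever frequency bins are supplied, and the choice to keep only the real part of the complex inner product are all assumptions made here. How the Morse phase is handled in practice is not specified in this summary.

```python
import numpy as np

def modified_psd(h1, h2, psd1, psd2, T):
    # xi(f) as defined above, evaluated per frequency bin.
    return (0.5 * np.abs(h1) ** 2 * psd2
            + 0.5 * np.abs(h2) ** 2 * psd1
            + 0.25 * T * psd1 * psd2)

def inner(a, b, xi, df):
    # Discretised (A, B): sum of conj(A) * B / xi over the supplied bins.
    # Assumption: the integral is approximated on the grid passed in.
    return np.sum(np.conj(a) * b / xi) * df

def normalized_rho_cc(s1, s2, h1, h2, psd1, psd2, T, df):
    # s1, s2: frequency-domain data; h1, h2: frequency-domain templates.
    xi = modified_psd(h1, h2, psd1, psd2, T)
    w = np.abs(h1) ** 2
    # Assumption: keep the real part of the complex inner products.
    numerator = inner(w, np.conj(s1) * s2, xi, df).real
    rho_max = numerator / np.sqrt(inner(w, w, xi, df).real)
    hh = np.conj(h1) * h2
    rho_opt = np.sqrt(inner(hh, hh, xi, df).real)
    return rho_max / rho_opt
```

In the noise-free limit with $\tilde s_i = \tilde h_i$, $\tilde h_2 = \sqrt{|\mu_{\mathrm{rel}}|}\,\tilde h_1$ and zero Morse phase difference, this sketch returns 1 up to floating-point rounding, since numerator and $\rho_{\mathrm{opt}}$ both reduce to $\sqrt{|\mu_{\mathrm{rel}}|}\,(|\tilde h_1|^2, |\tilde h_1|^2)_t^{1/2}$.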

Multi-detector networks employ a simple sum over all detector pairs:

$\hat\rho_{\mathrm{multi}} = \frac{ \sum_{i,j} \rho(s_1^{(D_i)}, s_2^{(D_j)}) }{ \sum_{i,j} \rho_{\mathrm{opt}}^2(s_1^{(D_i)}, s_2^{(D_j)}) }$

3. Stepwise Procedure and Implementation Workflow

The OCCAM algorithm is designed as a post-processing module following standard pipeline triggers:

Candidate selection: From matched-filter (MF) triggers, select event pairs with $\mathrm{SNR}_{\mathrm{MF}} \geq 10$ individually (or network SNR $\geq 8$ , with each detector $\geq 4$ ).
Relative magnification estimation: Compute $|\mu_{\mathrm{rel}}| = (\rho_{\mathrm{MF}}^{(2)}/\rho_{\mathrm{MF}}^{(1)})^2$ for candidate pairs.
Data slicing: For each event, extract the strain slice spanning the interval $[ t_{\mathrm{coa}} - \tau_{\mathrm{chirp}},\, t_{\mathrm{coa}} + 10\, \tau_{\mathrm{QNM}} ]$ , where

$\tau_{\mathrm{chirp}} = \frac{5}{256\pi\eta\,f_{\mathrm{low}}(\pi M_{\mathrm{tot}}f_{\mathrm{low}})^{5/3}}, \quad \tau_{\mathrm{QNM}} \approx 0.554 \left( \frac{M}{10M_\odot} \right) \mathrm{ms}$

Alignment: Time-align slices so that coalescence times coincide.
Frequency-domain transformation: Apply FFT to obtain $\tilde s_1(f)$ , $\tilde s_2(f)$ .
Template construction: Use the MF best-fit template of the louder event to compute $|\tilde h_1(f)|^2$ .
Numerator calculation: Evaluate $(|\tilde h_1|^2, \tilde s_1^* \tilde s_2)_t$ .
Denominator and statistic calculation: Compute $(|\tilde h_1|^2,|\tilde h_1|^2)_t^{1/2}$ , form $\rho_{\mathrm{CC}}^{\max}$ , then normalize to obtain $\hat\rho_{\mathrm{CC}}$ .
Thresholding: Flag pairs with $\hat\rho_{\mathrm{CC}}$ above threshold $\theta$ (set by analysis of backgrounds) as lensing candidates.
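
The magnification estimate and slice boundaries in the steps above can be sketched as below. The helper names are hypothetical, masses are taken in solar masses and converted to seconds with $G M_\odot / c^3$, and the mass entering $\tau_{\mathrm{QNM}}$ is assumed here to be the total mass, which the text does not state explicitly.

```python
import math

MTSUN_S = 4.925491e-6  # G * M_sun / c^3 in seconds

def chirp_time(m1, m2, f_low):
    # tau_chirp from the formula above; masses in solar masses, f_low in Hz.
    m_tot = (m1 + m2) * MTSUN_S
    eta = m1 * m2 / (m1 + m2) ** 2  # symmetric mass ratio
    return 5.0 / (256.0 * math.pi * eta * f_low
                  * (math.pi * m_tot * f_low) ** (5.0 / 3.0))

def qnm_time(m):
    # tau_QNM ~ 0.554 ms * (M / 10 M_sun), returned in seconds.
    # Assumption: m is the total mass in solar masses.
    return 0.554e-3 * (m / 10.0)

def slice_bounds(t_coa, m1, m2, f_low):
    # Interval [t_coa - tau_chirp, t_coa + 10 * tau_QNM] in seconds.
    return (t_coa - chirp_time(m1, m2, f_low),
            t_coa + 10.0 * qnm_time(m1 + m2))

def relative_magnification(snr_1, snr_2):
    # |mu_rel| = (rho_MF^(2) / rho_MF^(1))^2
    return (snr_2 / snr_1) ** 2
```

For a 30 + 30 solar-mass binary with a 20 Hz cutoff, `chirp_time` gives a slice of just under one second before coalescence, so the FFTs in the later steps act on short segments.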

A schematic summary of the OCCAM workflow is given below:

Step	Data	Operation
1	Matched-filter GW triggers	Event selection/filtering
2	Trigger/meta info	Magnification estimation
3	Strain time series	Data slicing/alignment
4	Sliced strain	FFT to frequency domain
5	Best-fit MF template	Template construction
6-8	Sliced/FFT data + template + PSDs	Cross-correlation computation
9	$\hat\rho_{\mathrm{CC}}$ statistic	Candidate flagging

4. Computational Complexity and Scaling

The bulk of computational effort in OCCAM arises from performing two FFTs per event pair (over slices of length $\sim T \times$ sampling rate) and evaluating the frequency-domain inner product, resulting in $O(N_f\log N_f)$ complexity per pair. For $M$ candidate events in each selection "slot," indiscriminate pairing would scale as $O(M^2)$ , but astrophysically motivated time-delay windows reduce the number of meaningful comparisons.

Application to multi-detector networks introduces an $O(N_{\rm det}^2)$ scaling, since $N_{\rm det}^2$ pairwise statistics are needed, but per-pair frequency integrals are efficiently reused. Because no Bayesian sampling or full parameter estimation is involved, the latency stays at seconds per pair on a single CPU.
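
As an illustration of how a time-delay window trims the $O(M^2)$ pairing, the sketch below lists only those event pairs whose arrival times differ by at most a chosen maximum delay. The function name and the `max_delay` parameter are hypothetical; the actual window used in the analysis is not given here.

```python
from bisect import bisect_right

def candidate_pairs(times, max_delay):
    # Return index pairs (i, j), with event i arriving no later than event j
    # and t_j - t_i <= max_delay. Sorting lets each event be compared only
    # with its neighbours in time rather than with every other event.
    order = sorted(range(len(times)), key=times.__getitem__)
    sorted_t = [times[i] for i in order]
    pairs = []
    for a, t in enumerate(sorted_t):
        stop = bisect_right(sorted_t, t + max_delay)
        for b in range(a + 1, stop):
            pairs.append((order[a], order[b]))
    return pairs
```

The cost is $O(M \log M)$ for the sort plus the number of pairs actually emitted, which is far below $M^2$ when the window is short compared with the observing span.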

5. Detection Performance and Comparative Metrics

OCCAM demonstrates significantly improved detection metrics relative to earlier rapid algorithms. For Advanced LIGO design PSD, 20 Hz low-frequency cutoff, and $\mathrm{SNR} \geq 10$ (single detector):

True positive rate: 97% detection at a false positive probability (FPP) of approximately 13%; 80% detection at FPP ≈ 7%.
ROC/AUC: Area under ROC (AUC) ≳ 0.98, outperforming alternatives such as the chi-square approach (Gholap et al. 2025) or GLANCE.
Network performance: For HLV network with $\mathrm{SNR}_{\rm net} \geq 8$ , per-detector $\mathrm{SNR}_i \geq 4$ , network ROC is significantly enhanced, allowing FPP ≲ 1% at >90% detection.

Direct comparisons indicate that OCCAM outperforms all other inexpensive methods considered, including unweighted cross-correlation (GLANCE), phase-consistency tests, chi-square classifiers, posterior-overlap statistics, and SLICK (machine learning), as measured by both ROC curves and AUC.
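
For readers unfamiliar with these metrics, the sketch below shows one common way to compute the detection rate, FPP, and AUC from two sets of $\hat\rho_{\mathrm{CC}}$ scores (lensed and unlensed pairs). It is a generic rank-based estimate, not the evaluation code behind the figures quoted above, and the function names are hypothetical.

```python
import numpy as np

def rates_at_threshold(lensed, unlensed, theta):
    # Fraction of lensed pairs flagged (detection rate) and fraction of
    # unlensed pairs flagged (false positive probability) at threshold theta.
    tpr = float(np.mean(np.asarray(lensed) > theta))
    fpp = float(np.mean(np.asarray(unlensed) > theta))
    return tpr, fpp

def auc(lensed, unlensed):
    # Probability that a random lensed pair scores above a random unlensed
    # pair, counting ties as one half. This equals the area under the ROC.
    l = np.asarray(lensed, dtype=float)[:, None]
    u = np.asarray(unlensed, dtype=float)[None, :]
    return float(np.mean((l > u) + 0.5 * (l == u)))
```

Sweeping `theta` over the range of scores traces out the ROC curve; an AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation.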

6. Astrophysical Assumptions and Role of Ancillary Information

The evaluation of OCCAM assumes the following context:

CBC population: Power-law plus peak BBH mass distribution, merger-rate redshift evolution per Oguri (2018).
Lensing model: Singular Isothermal Ellipsoid galaxies, generating time delays from minutes to months.
Waveform models: IMRPhenomD, non-spinning; detector noise approximated as Gaussian with design PSDs.
Sky localization: No localization constraints are imposed in primary OCCAM evaluations. Incorporation of sky-map overlap (e.g., from multi-detector triangulation) would suppress FPP by 1–2 orders of magnitude, as shown in Haris et al. (2018). This suggests that future deployments incorporating such auxiliary information could further enhance OCCAM's specificity.

7. Implications, Limitations, and Prospective Applications

OCCAM's statistically rigorous noise-weighted cross-correlation approach, utilising only matched-filter triggers and PSD estimates, achieves rapid, low-cost selection of lensed event candidates. This enables downstream Bayesian or machine-learning algorithms—typically of higher computational cost and latency—to focus on a sharply reduced candidate set. OCCAM remains effective even in single-detector scenarios.

With the anticipated growth in detected CBC triggers in upcoming GW observing runs, OCCAM is poised to serve as an essential low-latency pre-selection tool, substantially increasing the feasibility of detecting strongly lensed gravitational waves with current and next-generation interferometers (Kopty et al., 29 Jan 2026).


References

Optimal cross-correlation technique to search for strongly lensed gravitational waves (2026)
