
Strong Lensing Discovery Engine

Updated 9 February 2026
  • Strong Lensing Discovery Engine is an integrated framework that identifies gravitational lensing systems in large-scale surveys through automated pre-selection and expert validation.
  • It employs deep learning, ensemble methods, and citizen science to achieve high purity and completeness, crucial for advancing cosmology and dark matter research.
  • The engine combines imaging and spectroscopic data with rigorous statistical calibration, enabling scalable, multi-survey analysis and robust candidate catalog creation.

A Strong Lensing Discovery Engine is a specialized, end-to-end algorithmic and human-in-the-loop framework for the systematic extraction of strong gravitational lensing systems from large-scale astronomical data sets. These engines ingest imaging and/or spectroscopic data, implement data-driven or simulation-tuned pre-selection steps, and deploy advanced statistical, ML, or hybrid pipelines to produce catalogs of lens candidates with quantifiable completeness, purity, and selection function. The paradigm has been foundational in recent wide-area surveys—especially Euclid, DESI, CFHTLS, and the DESI Legacy Imaging Surveys—enabling lens discoveries at scales (hundreds to tens of thousands per survey) previously unattainable by manual or single-stage searches (Collaboration et al., 19 Mar 2025, Lines et al., 20 Aug 2025, Collaboration et al., 19 Mar 2025, Inchausti et al., 27 Aug 2025, Karp et al., 3 Dec 2025, Hsu et al., 19 Sep 2025, C. et al., 2022, McCarty et al., 2024, Stein et al., 2021, Sygnet et al., 2010). Strong Lensing Discovery Engines combine automated inference (deep learning, self-supervised methods), citizen science, and expert grading, and are regularly augmented by physical lens models and spectroscopic vetting. They are now essential infrastructure for survey cosmology, dark matter substructure constraints, and rare-object astrophysics.

1. Architectural Principles and Core Pipeline Structure

A prototypical Strong Lensing Discovery Engine operates via a multi-stage pipeline:

  1. Data Ingestion and Pre-processing: Survey imaging (optical, IR, or radio) and/or spectroscopic data are cut into postage-stamp images (typical scales 10″–30″, multi-band), rescaled, masked, and standardized. For spectral engines, continuum subtraction and emission-line masking are applied (Collaboration et al., 19 Mar 2025, Karp et al., 3 Dec 2025, Collaboration et al., 19 Mar 2025, Stein et al., 2021, Sheu et al., 2023, Sygnet et al., 2010).
  2. Candidate Pre-selection: Catalog-level cuts (e.g., magnitude, ellipticity, velocity dispersion) restrict the input to plausible deflectors; in some settings, morphological pre-selection (e.g. axis ratio, edge-on disks, SExtractor-based shape parameters) is used (Sygnet et al., 2010, Collaboration et al., 19 Mar 2025, Hsu et al., 19 Sep 2025).
  3. Automated Candidate Identification: Deep learning classifiers, self-supervised similarity search, or ensemble scores rank cutouts by lens likelihood, reducing millions of pre-selected targets to a tractable shortlist.
  4. Human-in-the-Loop Refinement: Citizen-science campaigns and expert grading vet the shortlist, assigning confidence grades (e.g. A/B/C) to surviving candidates.
  5. Lens Modelling and Physical Verification: Surviving candidates are mass-modelled (e.g. with singular isothermal ellipsoid + shear, pixel-based light modeling, PyAutoLens, Herculens) to infer lens parameters (θ_E, mass profile, shear). Some engines loop back, refining automated steps with confirmed lens models (Collaboration et al., 19 Mar 2025, Collaboration et al., 19 Mar 2025).
  6. Statistical Calibration and Validation: Engine performance is assessed using injection-recovery tests (simulated lens injection), ROC curves, completeness, and purity at varying score thresholds. Cross-validation against spectroscopic samples or external imaging is pursued where feasible (Collaboration et al., 19 Mar 2025, Collaboration et al., 19 Mar 2025, Karp et al., 3 Dec 2025, C. et al., 2022).
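The staged flow above can be sketched as a chain of filters over candidate records. This is a minimal illustration, not any survey's actual pipeline: the `Candidate` fields, cut values, and grade scheme below are hypothetical assumptions chosen only to show stages 2–4 composing.

```python
from dataclasses import dataclass

# Hypothetical candidate record; field names and cut values are illustrative only.
@dataclass
class Candidate:
    mag: float            # deflector magnitude
    ellipticity: float    # catalog shape parameter
    ml_score: float       # classifier output in [0, 1]
    expert_grade: str     # "A", "B", "C", or "" if not yet inspected

def preselect(cands, mag_lim=22.0, ell_max=0.9):
    """Stage 2: catalog-level cuts restricting to plausible deflectors."""
    return [c for c in cands if c.mag < mag_lim and c.ellipticity < ell_max]

def auto_identify(cands, score_min=0.8):
    """Stage 3: keep candidates above an automated classifier threshold."""
    return [c for c in cands if c.ml_score >= score_min]

def human_refine(cands, grades=("A", "B")):
    """Stage 4: retain only expert-graded probable/secure lenses."""
    return [c for c in cands if c.expert_grade in grades]

def run_engine(cands):
    return human_refine(auto_identify(preselect(cands)))

sample = [
    Candidate(21.0, 0.3, 0.95, "A"),
    Candidate(21.5, 0.5, 0.60, "B"),   # fails ML threshold
    Candidate(23.0, 0.2, 0.99, "A"),   # fails magnitude cut
    Candidate(20.8, 0.4, 0.85, "C"),   # fails expert grading
]
print(len(run_engine(sample)))  # -> 1
```

In a real engine each stage would also record why candidates were rejected, since the selection function depends on every cut applied upstream.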

2. Algorithms, Machine Learning, and Simulation Frameworks

Discovery engines implement a variety of algorithms and learning paradigms:

  • Supervised Deep Learning: CNNs trained using binary cross-entropy, with positive and negative instances from simulations (arcs painted onto real survey backgrounds) and hand-labeled non-lenses. EfficientNetV2 and fine-tuned Zoobot often outperform shallower, from-scratch supervised architectures, even with imbalanced training sets (Collaboration et al., 19 Mar 2025, Inchausti et al., 27 Aug 2025, C. et al., 2022).
  • Self-supervised Learning: Large-scale contrastive or generative methods generate representation embeddings (e.g., via ResNet-50 + MLP projector and InfoNCE objective), enabling similarity search and efficient few-shot classification when labeled positives are rare (Stein et al., 2021).
  • Ensemble Learning: Multiple independently trained models—differing in architecture, augmentation, or simulation priors—are combined using Bayesian ensembles or stacking meta-learners (shallow neural nets over model outputs) to improve precision and recall (Collaboration et al., 19 Mar 2025, Inchausti et al., 27 Aug 2025).
  • Semi-supervised and Generative Augmentation: MixMatch, Π-Model, and GAN-augmented data improve performance when labeled lens samples are scarce, particularly in deeper surveys or for exotic morphologies (C. et al., 2022).
  • Spectroscopic and Pair-wise Methods: In surveys such as DESI, pairwise association of fiber spectra with significantly different redshifts within θ_link ≃ 3″, combined with impact-parameter probability computations and SIS-based θ_E estimates, allows unbiased selection of lensing configurations invisible to photometric ML (Karp et al., 3 Dec 2025, Hsu et al., 19 Sep 2025).
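The SIS-based θ_E estimate mentioned in the pairwise method follows the standard relation θ_E = 4π(σ/c)² D_ls/D_s. The sketch below evaluates it in a flat ΛCDM cosmology with simple trapezoidal integration; the cosmological parameter values (H0 = 70, Ωm = 0.3) are illustrative assumptions, not those of any cited analysis.

```python
import math

C_KMS = 299_792.458   # speed of light [km/s]
H0 = 70.0             # Hubble constant [km/s/Mpc]; illustrative value
OM, OL = 0.3, 0.7     # flat-LambdaCDM density parameters; illustrative

def comoving_distance(z, steps=2000):
    """Line-of-sight comoving distance [Mpc] via trapezoidal integration."""
    dz = z / steps
    integrand = lambda zp: 1.0 / math.sqrt(OM * (1 + zp) ** 3 + OL)
    s = 0.5 * (integrand(0.0) + integrand(z))
    s += sum(integrand(i * dz) for i in range(1, steps))
    return (C_KMS / H0) * s * dz

def theta_e_sis(sigma_kms, z_lens, z_src):
    """SIS Einstein radius [arcsec]: theta_E = 4*pi*(sigma/c)^2 * D_ls/D_s."""
    dc_l, dc_s = comoving_distance(z_lens), comoving_distance(z_src)
    d_s = dc_s / (1 + z_src)                # angular-diameter distance to source
    d_ls = (dc_s - dc_l) / (1 + z_src)      # lens-source distance (flat universe)
    theta_rad = 4 * math.pi * (sigma_kms / C_KMS) ** 2 * d_ls / d_s
    return math.degrees(theta_rad) * 3600

# A sigma ~ 250 km/s deflector at z = 0.3 lensing a z = 1 source
# yields theta_E of order an arcsecond.
print(round(theta_e_sis(250.0, 0.3, 1.0), 2))
```

A production pipeline would use a vetted cosmology library rather than hand-rolled integration, but the distance-ratio structure of the estimate is the same.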

3. Performance Metrics, Validation, and Yields

Performance is systematically tracked using completeness, purity, and ROC curves as functions of classifier score threshold, with injection-recovery tests on simulated lenses providing the absolute calibration; where available, spectroscopic confirmation and external imaging supply independent cross-checks. Representative yields at these operating points are collected in the table of Section 7.
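Completeness and purity as functions of score threshold can be read directly off an injection-recovery test. The sketch below assumes the non-lens comparison sample contains no real lenses; the score values are invented for illustration, not drawn from any survey run.

```python
def completeness_purity(scores_injected, scores_real_nonlens, threshold):
    """
    Injection-recovery calibration sketch: completeness is the fraction of
    injected (simulated) lenses recovered above the score threshold; purity
    is the fraction of above-threshold objects that are injected lenses,
    assuming the non-lens sample is free of real lenses.
    """
    tp = sum(s >= threshold for s in scores_injected)
    fp = sum(s >= threshold for s in scores_real_nonlens)
    completeness = tp / len(scores_injected)
    purity = tp / (tp + fp) if (tp + fp) else 0.0
    return completeness, purity

# Illustrative classifier scores (not from any real survey run).
injected = [0.95, 0.88, 0.91, 0.42, 0.77]
nonlens  = [0.10, 0.35, 0.81, 0.05, 0.22, 0.15]

for thr in (0.5, 0.8):
    c, p = completeness_purity(injected, nonlens, thr)
    print(f"thr={thr}: completeness={c:.2f}, purity={p:.2f}")
```

Sweeping the threshold traces out the completeness-purity trade-off that engines report alongside their candidate catalogs.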

4. Specialized Discovery Pathways: Edge-on Disks, Dimple Lenses, Double-Source-Plane Lenses

Several engines target or have characterized rare, scientifically valuable subpopulations:

  • Edge-on Disk Lenses: Three-stage pipelines (catalog extraction, Tully–Fisher velocity dispersion proxy, visual inspection of arc geometry/color) are used to extract this sparsely populated class, with confirmed “A-class” systems showing ∼100% purity and near-unity completeness for θ_E ≳ 0.4″ (Sygnet et al., 2010).
  • Dimple Lenses: Pairwise spectroscopic searches enable the systematic discovery of “dimple lenses,” where a low-mass lens creates a surface-brightness indentation in a background galaxy—probing sub-L* halos at cosmological distances (Hsu et al., 19 Sep 2025).
  • Double-Source-Plane Lenses (DSPLs): Engines built for Euclid leverage ML plus expert inspection to extract DSPLs, which are forecast to number in the thousands in the Euclid Wide Survey—a 100-fold increase over pre-Euclid samples, crucial for multi-plane cosmography (Collaboration et al., 19 Mar 2025, Collaboration et al., 19 Mar 2025).
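The pairwise spectroscopic selection behind the dimple-lens search reduces to finding fiber pairs that are nearly coincident on the sky but at very different redshifts. The sketch below uses the θ_link ≃ 3″ linking angle from the text; the brute-force O(n²) loop, the minimum redshift separation, and the fiber table are illustrative assumptions (a real search would use a spatial index and probability-based vetting).

```python
import math

THETA_LINK_ARCSEC = 3.0   # linking angle from the pairwise DESI search
DZ_MIN = 0.1              # minimum redshift separation; illustrative choice

def angsep_arcsec(ra1, dec1, ra2, dec2):
    """Angular separation [arcsec] between two sky positions [deg],
    via the spherical law of cosines (adequate at arcsecond scales here)."""
    r1, d1, r2, d2 = map(math.radians, (ra1, dec1, ra2, dec2))
    cossep = (math.sin(d1) * math.sin(d2)
              + math.cos(d1) * math.cos(d2) * math.cos(r1 - r2))
    return math.degrees(math.acos(min(1.0, cossep))) * 3600

def lens_pairs(fibers):
    """Return index pairs of spectra that are nearly coincident on the sky
    but at significantly different redshifts (candidate lens configurations)."""
    out = []
    for i in range(len(fibers)):
        for j in range(i + 1, len(fibers)):
            (ra1, dec1, z1), (ra2, dec2, z2) = fibers[i], fibers[j]
            if (angsep_arcsec(ra1, dec1, ra2, dec2) <= THETA_LINK_ARCSEC
                    and abs(z1 - z2) >= DZ_MIN):
                out.append((i, j))
    return out

# Illustrative fiber table: (RA [deg], Dec [deg], redshift).
fibers = [
    (150.00000, 2.00000, 0.35),
    (150.00050, 2.00020, 1.20),  # ~2" away, much higher z: candidate pair
    (150.10000, 2.00000, 0.36),  # far away on the sky
]
print(lens_pairs(fibers))
```

Because the selection never looks at image morphology, it can recover configurations—like dimple lenses—that photometric ML classifiers are not trained to flag.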

5. Cross-survey Generalization, Computational Scaling, and Ensemble Strategies

Modern engines are designed for dataset- and survey-agnostic scalability.

6. Scientific Impact and Future Directions

The rise of Strong Lensing Discovery Engines is transforming lensing science:

  • Population-scale lens catalogs (O(10²–10⁵) secure lenses) are enabling statistical studies of dark matter substructure, halo mass functions, stellar mass–halo mass relations, and time-delay cosmography, with forecast H₀ uncertainties reaching ≲ 1% for O(100) lenses (McCarty et al., 2024, Karp et al., 3 Dec 2025, Collaboration et al., 19 Mar 2025).
  • Expansion to new regimes, including low-mass (dwarfs/sub-L*) lenses, double-source-plane and compound lenses, and the radio/infrared transient lens domain, is directly facilitated by these engines (Hsu et al., 19 Sep 2025, Collaboration et al., 19 Mar 2025, McCarty et al., 2024, Sheu et al., 2023).
  • Synergistic survey overlap (e.g., Euclid, LSST, DESI, Roman) enables cross-validation, photometric and spectroscopic redshift calibration, and further purity gains (McCarty et al., 2024).
  • Refinement and standardization of selection functions, probability calculations, and injection-based calibration is now central, with community-shared pipelines (e.g., PyAutoLens, Herculens) and open candidate lists supporting rapid science exploitation (Collaboration et al., 19 Mar 2025, Stein et al., 2021).
  • Automated triage and modeling, underpinned by learning-augmented lensing engines, will be required to harvest robust samples for next-generation cosmological and astrophysical analyses as data volume accelerates through 2030.

7. Summary Table: Representative Strong Lensing Discovery Engine Modalities

| Modality | Input Data | Core Algorithm | Yield (recent) |
|---|---|---|---|
| Euclid Q1 (SLDE A–E) | VIS/NIR imaging | ML ensemble + citizen science + expert grading | 500 Grade-A lenses |
| DESI Single-Fiber | Optical spectroscopy | [O II] doublet in LRGs + PDF | 4,110 candidates |
| DESI Pairwise Spectroscopic | Matched redshift pairs + DR10 imaging | Sky FoF + visual inspection + θ_E modeling | 2,046 (conventional), 318 (dimple) |
| DESI Legacy Imaging (DR7–DR10) | grz imaging | ResNet, EfficientNet, stacking | 3,868 new candidates |
| DLS ML+SSL+GAN | BVR imaging | ResNetV2, SSL, MixMatch, GANs | >20 Grade-A/B in 20 deg² |
| DECaLS Self-supervised (Stein et al., 2021) | grz imaging | Contrastive representation learning + kNN | 1,192 new candidates |
| DSA-2000 radio surveys | Radio intensity, spectral index | PSF deconvolution, ResNet/UNet | O(10⁵) (forecast) |

These engines underpin the shift from order-unity samples in early lensing work to mass-produced, well-characterized catalogs with known selection effects, enabling next-generation science in cosmology, galaxy evolution, and dark sector physics.
