Photon–Pion Discrimination in Colliders

Updated 15 November 2025
  • Photon–pion discrimination is the process of distinguishing single prompt photons from neutral pion decay photons by analyzing overlapping electromagnetic showers in high-energy collider environments.
  • Advanced ML methods such as BDT, DNN, and ResNet exploit engineered shower-shape features or raw calorimeter cell energies, achieving AUC up to 0.965 and signal efficiencies of 84% at 10⁻³ FPR.
  • Techniques like soft scoring and auxiliary ΔR regression integrate physics insights into end-to-end CNN models, enhancing discrimination performance under high-pileup conditions.

Photon–pion ($\gamma$/$\pi^0$) discrimination is the classification task in high-energy physics of distinguishing single prompt photons from background photons originating from neutral pion ($\pi^0$) decays. The challenge is acute in high-luminosity collider environments such as the LHC, where the high rate of QCD jets containing neutral pions ($\pi^0\to\gamma\gamma$) leads to overlapping electromagnetic (EM) showers within the calorimeter granularity. The regime of collimated, merged showers (diphoton separation $\Delta R_{\gamma\gamma}$ below the calorimeter cell size) is particularly difficult for traditional, hand-engineered variables, which motivates applying advanced machine learning techniques to high-granularity detector data.
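To see why merged showers become the norm at collider energies, the following minimal sketch evaluates the smallest opening angle between the two photons of a $\pi^0\to\gamma\gamma$ decay, which follows from two-body kinematics as $\sin(\theta_{\min}/2) = m_{\pi^0}/E$ for the symmetric decay. The function name and the chosen energies are illustrative assumptions, and the opening angle is only a rough proxy for $\Delta R_{\gamma\gamma}$ (reasonable near central pseudorapidity).

```python
import math

M_PI0_GEV = 0.1349768  # neutral pion mass in GeV


def min_opening_angle(energy_gev: float) -> float:
    """Smallest lab-frame opening angle (radians) between the two photons
    of pi0 -> gamma gamma, reached when both photons share the energy equally."""
    if energy_gev <= M_PI0_GEV:
        raise ValueError("pion energy must exceed its rest mass")
    return 2.0 * math.asin(M_PI0_GEV / energy_gev)


# EM1/EM2 cell width in eta quoted in this article
EM_CELL_ETA = 0.025

# Illustrative pion energies (assumed values, not taken from the study)
for energy in (5.0, 20.0, 50.0):
    theta = min_opening_angle(energy)
    print(f"E = {energy:5.1f} GeV  theta_min = {theta:.4f} rad  "
          f"below cell width: {theta < EM_CELL_ETA}")
```

Under these assumptions the minimum opening angle drops below one EM1/EM2 cell width once the pion energy exceeds roughly 11 GeV, so a large share of energetic pions deposits both photons in the same or adjacent cells.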

1. Detector Simulations and Calorimeter Segmentation

Photon–pion discrimination studies are grounded in full Geant4-based detector simulations with realistic geometry, here realized with the COCOA-HEP simulation framework to approximate the ATLAS EM calorimeter at $\sqrt{s}=14$ TeV. The EM calorimeter is structured into three longitudinal layers (EM1, EM2, EM3), each with distinct transverse segmentation:

  • EM1, EM2: $\Delta\eta\times\Delta\phi=0.025\times0.0245$
  • EM3: $\Delta\eta\times\Delta\phi=0.050\times0.0491$

The hadronic (HAD) calorimeter is more coarsely segmented, with three layers (HAD1–HAD3) whose cells range from $0.100\times0.0982$ up to $0.200\times0.1965$ in $\Delta\eta\times\Delta\phi$. Each candidate shower is extracted as a fixed-size region of interest (ROI) around the candidate and processed into separate tensor representations for the EM and HAD sections.
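The quoted $\Delta\phi$ values correspond to dividing the full azimuth into a power-of-two number of cells, which the sketch below makes explicit. It also shows how a window size translates into a cell grid per layer. The dictionary keys, function names, and the 0.4 window used in the test are hypothetical choices for illustration, not values from the study.

```python
import math
from typing import Dict, Tuple

# (delta_eta, delta_phi) cell sizes quoted in the text above.
# Only the finest and coarsest HAD granularities are quoted there.
CELL_SIZES: Dict[str, Tuple[float, float]] = {
    "EM1": (0.025, 0.0245),
    "EM2": (0.025, 0.0245),
    "EM3": (0.050, 0.0491),
    "HAD_finest": (0.100, 0.0982),
    "HAD_coarsest": (0.200, 0.1965),
}


def phi_cells_full_ring(delta_phi: float) -> int:
    """Number of cells of width delta_phi needed to cover 2*pi in azimuth."""
    return round(2.0 * math.pi / delta_phi)


def cells_in_window(layer: str, window_eta: float, window_phi: float) -> Tuple[int, int]:
    """Approximate (n_eta, n_phi) cell counts spanning a window of the given size."""
    d_eta, d_phi = CELL_SIZES[layer]
    return max(1, round(window_eta / d_eta)), max(1, round(window_phi / d_phi))


for name, (_, d_phi) in CELL_SIZES.items():
    print(f"{name:12s} phi cells per ring: {phi_cells_full_ring(d_phi)}")
```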

Signal events comprise prompt photons at GeV-scale energies, while the dominant background is QCD $\pi^0\to\gamma\gamma$ production. About two-thirds of the $\pi^0$ decays yield a diphoton separation $\Delta R_{\gamma\gamma}$ smaller than the EM1/EM2 cell scale, generating overlapping showers that cannot be resolved at the cell granularity.

2. High-Level Shower-Shape Variables

Traditional discrimination algorithms utilize engineered "shower-shape" variables based on integrations over specific geometrical windows of the calorimeter. Twenty such features are defined, following ATLAS conventions. Key variables include:

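  • Hadronic leakage (energy reaching the HAD calorimeter relative to the EM energy)
  • Isolation fraction
  • Layer energy fractions (share of the shower energy deposited in each longitudinal layer)
  • Lateral shower widths in EM2 (energy-weighted widths along $\eta$ and, analogously, along $\phi$)
  • Energy ratio in EM2 (built from the two leading cells)
  • Leading cell separation (distance between the two most energetic cells)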

Additional analogous moments and ratios are constructed in both EM1 and EM3, resulting in a 20-dimensional feature vector per candidate.
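As a rough illustration of how such variables are computed from a cell grid, the sketch below implements layer energy fractions, an energy-weighted lateral width, a leading-cell energy ratio, and the leading-cell separation. The exact definitions used in the study are not reproduced here; the formulas (RMS width, $(E_1-E_2)/(E_1+E_2)$ ratio, Euclidean cell distance) and all function names are assumptions chosen to match common calorimeter conventions.

```python
from typing import List, Tuple

Grid = List[List[float]]  # rows index eta cells, columns index phi cells


def layer_fractions(layer_energies: List[float]) -> List[float]:
    """Share of the total energy deposited in each longitudinal layer."""
    total = sum(layer_energies)
    if total <= 0:
        return [0.0] * len(layer_energies)
    return [e / total for e in layer_energies]


def eta_width(grid: Grid, cell_eta: float) -> float:
    """Energy-weighted RMS width along eta (assumed definition)."""
    row_sums = [sum(row) for row in grid]
    total = sum(row_sums)
    if total <= 0:
        return 0.0
    mean = sum(i * e for i, e in enumerate(row_sums)) / total
    var = sum((i - mean) ** 2 * e for i, e in enumerate(row_sums)) / total
    return cell_eta * var ** 0.5


def leading_two(grid: Grid) -> Tuple[Tuple[float, int, int], Tuple[float, int, int]]:
    """The two most energetic cells as (energy, eta_index, phi_index)."""
    cells = sorted(
        ((e, i, j) for i, row in enumerate(grid) for j, e in enumerate(row)),
        reverse=True,
    )
    return cells[0], cells[1]


def energy_ratio(grid: Grid) -> float:
    """(E1 - E2) / (E1 + E2) for the two leading cells (assumed definition)."""
    (e1, _, _), (e2, _, _) = leading_two(grid)
    return (e1 - e2) / (e1 + e2) if (e1 + e2) > 0 else 0.0


def leading_separation(grid: Grid) -> float:
    """Distance between the two leading cells, in units of cells."""
    (_, i1, j1), (_, i2, j2) = leading_two(grid)
    return ((i1 - i2) ** 2 + (j1 - j2) ** 2) ** 0.5


# Toy 3x3 EM2 window with energies in GeV (made-up numbers)
toy = [[0.0, 1.0, 0.0], [1.0, 6.0, 2.0], [0.0, 1.0, 0.0]]
print(energy_ratio(toy), leading_separation(toy), eta_width(toy, 0.025))
```

A single photon tends to give an energy ratio near 1 and a narrow width, whereas a partially resolved $\pi^0$ lowers the ratio and broadens the shower; once both photons land in one cell, these quantities lose their separating power.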

3. Machine Learning Approaches: BDT, DNN, and ResNet Architectures

Three primary ML methods are benchmarked for photon–pion discrimination:

(a) Boosted Decision Tree (BDT) on Shower-Shape Variables

The BDT employs XGBoost configured with 500 trees, a maximum depth of 6, learning rate 0.1, and subsampling (0.8). Training utilizes 70% of the data (700k background, 682k signal); remaining events are reserved for testing. Binary cross-entropy (logistic loss) is used.

Typical signal efficiency at a fixed false positive rate (FPR) of $10^{-3}$ is 60–65%. The area under the ROC curve (AUC), evaluated above a minimum FPR cutoff, is approximately 0.90.
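The figure of merit quoted throughout, signal efficiency at a fixed FPR, can be computed from classifier scores as in the following minimal sketch. It assumes higher scores mean "more photon-like"; the function name and the toy scores are illustrative only.

```python
from typing import Sequence


def signal_efficiency_at_fpr(
    sig_scores: Sequence[float],
    bkg_scores: Sequence[float],
    target_fpr: float = 1e-3,
) -> float:
    """Fraction of signal passing the tightest cut that lets through
    at most target_fpr of the background."""
    if not sig_scores or not bkg_scores:
        raise ValueError("both score lists must be non-empty")
    ranked = sorted(bkg_scores, reverse=True)
    # Number of background events allowed to pass (epsilon guards float rounding)
    n_allowed = int(target_fpr * len(ranked) + 1e-9)
    # Require a score strictly above the (n_allowed + 1)-th highest background score
    threshold = ranked[n_allowed] if n_allowed < len(ranked) else float("-inf")
    return sum(1 for s in sig_scores if s > threshold) / len(sig_scores)


# Toy example with a loose 25% FPR so the numbers can be checked by hand
bkg = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
sig = [0.95, 0.85, 0.75, 0.65]
print(signal_efficiency_at_fpr(sig, bkg, target_fpr=0.25))
```

At an FPR of $10^{-3}$ only one background event in a thousand may pass, so a meaningful estimate needs a test sample much larger than a thousand background events.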

(b) Dense Neural Network (DNN) on Shower-Shape Variables

The DNN takes the 20 engineered variables as input and is structured as four fully connected layers with [64 → 128 → 64 → 32] nodes, each block consisting of Linear, BatchNorm, ReLU, and Dropout layers. The output node is sigmoid-activated. Training uses the Adam optimizer with weight decay and a binary cross-entropy loss.

DNN performance moderately exceeds the BDT, yielding about 65–70% signal efficiency at $10^{-3}$ FPR; AUC ≈ 0.92.
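To give a sense of the model's size, the sketch below counts trainable parameters for the layer widths listed above. It assumes the reading 20 → 64 → 128 → 64 → 32 → 1, with biases on every Linear layer and a learnable scale and shift per BatchNorm feature; these are assumptions about the implementation, not details confirmed by the source.

```python
from typing import List

IN_FEATURES = 20              # engineered shower-shape variables
HIDDEN = [64, 128, 64, 32]    # hidden layer widths from the text


def count_parameters(n_in: int, hidden: List[int], batchnorm: bool = True) -> int:
    """Trainable parameters of a fully connected net ending in one output node."""
    total = 0
    prev = n_in
    for width in hidden:
        total += prev * width + width      # Linear weights and biases
        if batchnorm:
            total += 2 * width             # BatchNorm scale and shift
        prev = width
    total += prev + 1                      # single sigmoid output node
    return total


print(count_parameters(IN_FEATURES, HIDDEN))                    # with BatchNorm
print(count_parameters(IN_FEATURES, HIDDEN, batchnorm=False))   # Linear layers only
```

With roughly twenty thousand parameters the network is small, consistent with its input being a 20-dimensional summary rather than the full cell image.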

(c) ResNet-Based Convolutional Neural Network on Raw Cell Energies

The ResNet is applied directly to the full-granularity calorimeter cell energies. A dual-branch architecture processes the EM and HAD tensors separately, each branch with three residual blocks composed of:

  • Conv2D layers ($3\times3$ kernel, stride 1, padding 1), filters = [32, 64, 128]
  • BatchNorm, ReLU activations, skip connections, and global average pooling

Feature vectors (length 128 each) from the EM and HAD branches are concatenated and passed through several FC layers ([256→128→64], with ReLU and Dropout) to a final sigmoid output. The network is trained with AdamW (with weight decay) and a binary cross-entropy loss.
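The shape bookkeeping behind this architecture can be checked with the standard convolution output-size formula. The sketch below assumes a $3\times3$ kernel (the size that preserves spatial dimensions at stride 1, padding 1) and a hypothetical 16-cell input side; neither value is confirmed by the source.

```python
from typing import List


def conv_out_size(n_in: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Output length of a convolution along one spatial axis."""
    return (n_in + 2 * padding - kernel) // stride + 1


def branch_feature_length(filters: List[int]) -> int:
    """Global average pooling leaves one value per channel of the last block."""
    return filters[-1]


FILTERS = [32, 64, 128]   # channels of the three residual blocks
side = 16                 # hypothetical input side length in cells

for _ in FILTERS:
    # stride 1, padding 1 with an assumed 3x3 kernel keeps the image size fixed
    side = conv_out_size(side, kernel=3, stride=1, padding=1)

# EM and HAD branches each contribute one pooled vector
fused_length = 2 * branch_feature_length(FILTERS)
print(side, fused_length)
```

Because the spatial size is preserved, the skip connections can add block inputs and outputs without resampling in space, and the fused length of 256 matches the first FC layer quoted above.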

ResNet achieves substantial gains: AUC ≈ 0.96 and 80% signal efficiency at $10^{-3}$ FPR.

4. Physics-Informed Refinements: Soft-Scoring and Auxiliary $\Delta R_{\gamma\gamma}$ Regression

Performance is further improved by physics-informed strategies:

  • Soft scoring (label smoothing): "Hard" background events, whose $\Delta R_{\gamma\gamma}$ falls below a threshold, receive soft labels that follow a Fermi–Dirac shape in $\Delta R_{\gamma\gamma}$ with fixed shape parameters. This reduces the penalization for highly overlapping backgrounds that are fundamentally ambiguous.

  • Auxiliary $\Delta R_{\gamma\gamma}$ regression head (multi-task): An additional regression branch, attached after the main feature concatenation, predicts the opening angle $\Delta R_{\gamma\gamma}$ and is trained with a dedicated regression loss.

  • Combined loss: The total loss is the classification loss plus the regression loss scaled by a fixed weight.

These refinements push the ResNet AUC to ≈0.965 and 84% signal efficiency at $10^{-3}$ FPR.
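One way to picture the two refinements is the following per-event sketch of a Fermi–Dirac soft target and a combined classification-plus-regression loss. Every numerical value (threshold, midpoint, width, maximum label, loss weight), the squared-error regression term, and the convention that signal is labeled 1 and resolved background 0 are assumptions made for illustration; the study's actual parameters and functional forms are not reproduced here.

```python
import math


def fermi_dirac(x: float, midpoint: float, width: float) -> float:
    """1 / (1 + exp((x - midpoint) / width)), clamped to avoid overflow."""
    arg = max(-700.0, min(700.0, (x - midpoint) / width))
    return 1.0 / (1.0 + math.exp(arg))


def soft_label(delta_r: float, threshold: float = 0.025, midpoint: float = 0.0125,
               width: float = 0.005, max_label: float = 0.5) -> float:
    """Training target for a background event (all defaults are hypothetical).
    Resolved pairs keep the hard label 0; merged pairs drift toward max_label."""
    if delta_r >= threshold:
        return 0.0
    return max_label * fermi_dirac(delta_r, midpoint, width)


def bce(p: float, y: float) -> float:
    """Binary cross-entropy for prediction p and (possibly soft) target y."""
    p = min(max(p, 1e-12), 1.0 - 1e-12)
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))


def total_loss(p: float, y: float, dr_pred: float, dr_true: float,
               lam: float = 0.1) -> float:
    """Classification loss plus a weighted squared-error term on delta R."""
    return bce(p, y) + lam * (dr_pred - dr_true) ** 2


for dr in (0.002, 0.0125, 0.02, 0.03):
    print(f"delta R = {dr:.4f}  soft label = {soft_label(dr):.3f}")
```

With a soft target $y$, the cross-entropy is minimized at a predicted score equal to $y$ rather than 0, so the network is no longer pushed to assign confident background scores to events that are indistinguishable from single photons at the cell level.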

5. Quantitative Performance and Comparative Analysis

The following summarizes the discriminative power of the tested ML strategies for high-purity photon selection:

| Method | AUC | Signal efficiency at $10^{-3}$ FPR |
|---|---|---|
| BDT on shower-shape | 0.90 | 55–65% |
| DNN on shower-shape | 0.92 | 65–70% |
| ResNet on raw energies | 0.96 | 80% |
| ResNet + Soft scoring | 0.963 | 82% |
| ResNet + Soft + $\Delta R_{\gamma\gamma}$ regression | 0.965 | 84% |

Additional performance characteristics:

  • Turn-on curve: ResNet variants show a rapid efficiency turn-on at low energies and reach their plateau by 40 GeV, outperforming the BDT and DNN, which plateau at 60–70%.
  • $\Delta R_{\gamma\gamma}$ dependence: For merged photon pairs (small $\Delta R_{\gamma\gamma}$), ResNet+aux improves rejection by roughly 20% over BDT/DNN. For better-separated pairs, all methods perform better, but the ResNet maintains a 5–10% advantage in signal efficiency at a fixed background mis-ID rate.

6. Mechanisms of ResNet Superiority

ResNet architectures achieve superior discrimination by exploiting the complete 2D $(\eta,\phi)$ shower topology across all longitudinal layers, learning subtle correlations and sub-cluster structure that fixed, high-level moments fail to capture. Residual connections enable deep feature extraction by mitigating vanishing gradients, and they support both local and global feature learning over the calorimeter image. Multi-task learning (auxiliary $\Delta R_{\gamma\gamma}$ regression) and soft labeling embed physics knowledge directly into training, sharpening the network's focus on the most challenging, ambiguous cases.

7. Recommendations for Detector and ML Architecture Design

Prospective improvements for photon–pion discrimination include:

  • Finer transverse segmentation in the first EM layer, below the current cell size, to resolve highly collimated photon pairs.
  • Increased longitudinal segmentation (beyond three EM layers) to utilize depth profile differences for distinguishing overlaps.
  • Hybrid models integrating tracker-based information (vertex, conversions) with calorimeter images for enhanced $\pi^0$ rejection.
  • Physics-informed loss functions (mass constraints, opening angle regression) and systematic adoption of multi-task setups.
  • Prioritization of end-to-end learning approaches that minimize reliance on manual feature engineering, as required for the high-pileup, high-overlap environments of future colliders.

A plausible implication is that continued evolution toward architectures operating directly on raw, high-granularity calorimeter data, with explicit physics guidance, will be central to robust photon identification at very low background rates across wide kinematic regimes.
