Explainable Particle Chebyshev Network
- E-PCN is a deep graph neural network that models jets as particle graphs using Chebyshev spectral convolutions and EdgeConv layers.
- It employs four parallel branches weighted by distinct Lund-plane kinematic features to capture jet substructure with precision.
- Integrating Grad-CAM explainability, E-PCN quantitatively attributes classification decisions to underlying physical features.
The Explainable Particle Chebyshev Network (E-PCN) is a graph neural network (GNN) framework designed for jet tagging tasks in experimental high-energy physics. E-PCN enhances the interpretability and discrimination power of deep graph-based classifiers by simultaneously encoding multiple kinematic relationships via parallel spectral graph branches, each derived from a distinct jet substructure measure. The architecture combines Chebyshev spectral convolutions and EdgeConv layers over kinematically weighted particle graphs, allowing explicit attribution of classification decisions to underlying physical features via a Grad-CAM–derived approach (Islam et al., 8 Dec 2025).
1. Foundations: Particle Chebyshev Networks and Jet Graph Representation
Particle Chebyshev Networks (PCN) model jets as undirected graphs $G = (V, E)$, where $V$ is the set of detected constituent particles and $E$ encodes proximity relations in pseudorapidity–azimuth ($\eta$–$\phi$) space. Each node is instantiated with a $d$-dimensional feature vector encompassing momentum components ($p_x, p_y, p_z$), energy ($E$), transverse momentum ($p_T$), spatial coordinates ($\eta, \phi$), impact parameters, and particle identification flags.
Edges join each particle to its $k$ nearest neighbors determined in $(\eta, \phi)$ using a KD-tree. The resulting adjacency matrix $A$ is used to construct the graph Laplacian $L = D - A$, with $D$ the degree matrix. To enable spectral graph convolutions via Chebyshev polynomials, $L$ is rescaled to $\tilde{L} = \frac{2}{\lambda_{\max}} L - I$, with $\lambda_{\max}$ typically approximated as $2$.
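As a minimal illustration of this graph construction, the sketch below builds the $k$-nearest-neighbor adjacency in $(\eta, \phi)$ and the rescaled Laplacian with plain numpy. The function name `build_scaled_laplacian` is illustrative, and a brute-force distance matrix stands in for the KD-tree query described above (equivalent for small jets):

```python
import numpy as np

def build_scaled_laplacian(eta, phi, k=3, lam_max=2.0):
    """kNN graph in (eta, phi); returns the rescaled Laplacian L~ = (2/lam_max) L - I."""
    pts = np.stack([eta, phi], axis=1)               # (N, 2) particle coordinates
    d2 = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)                     # exclude self as a neighbor
    nn = np.argsort(d2, axis=1)[:, :k]               # k nearest neighbors per particle
    N = len(eta)
    A = np.zeros((N, N))
    A[np.repeat(np.arange(N), k), nn.ravel()] = 1.0
    A = np.maximum(A, A.T)                           # symmetrize the adjacency
    L = np.diag(A.sum(axis=1)) - A                   # combinatorial Laplacian L = D - A
    return (2.0 / lam_max) * L - np.eye(N)           # spectrum rescaled toward [-1, 1]
```

With $\lambda_{\max} = 2$, the rescaling reduces to $\tilde{L} = L - I$, which keeps the recursion below numerically stable.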
Chebyshev convolution applies polynomial filters recursively:

$$T_0(x) = 1, \quad T_1(x) = x, \quad T_k(x) = 2x\,T_{k-1}(x) - T_{k-2}(x).$$

A single ChebConv layer transforms input node signals $X$ as:

$$X' = \sum_{k=0}^{K-1} T_k(\tilde{L})\, X\, \Theta_k,$$

with learnable parameters $\Theta_k$ over polynomial orders $k = 0, \dots, K-1$.
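The recursion above never materializes $T_k(\tilde{L})$ explicitly; only matrix–vector products $T_k(\tilde{L})X$ are propagated. A minimal numpy sketch (the name `cheb_conv` is illustrative):

```python
import numpy as np

def cheb_conv(L_tilde, X, thetas):
    """Spectral filter X' = sum_k T_k(L~) X Theta_k via the Chebyshev recursion."""
    Tx_prev, Tx = X, L_tilde @ X                 # T_0(L~) X = X,  T_1(L~) X = L~ X
    out = Tx_prev @ thetas[0]
    if len(thetas) > 1:
        out = out + Tx @ thetas[1]
    for theta in thetas[2:]:
        # T_k(L~) X = 2 L~ T_{k-1}(L~) X - T_{k-2}(L~) X
        Tx_prev, Tx = Tx, 2.0 * L_tilde @ Tx - Tx_prev
        out = out + Tx @ theta
    return out
```

Each `thetas[k]` plays the role of $\Theta_k$, mapping input to output feature dimension; cost is $O(K |E| d)$ rather than a dense eigendecomposition.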
2. E-PCN Architecture: Multi-Kinematic Graph Branches
E-PCN advances PCN by constructing four parallel graph views, each weighted by a specific Lund-plane–inspired kinematic variable: angular separation ($\Delta R$), transverse momentum ($k_T$), momentum fraction ($z$), and invariant mass squared ($m^2$). For each edge $(i, j)$, the following are computed in logarithmic form:

$$\Delta R_{ij} = \sqrt{(\eta_i - \eta_j)^2 + (\phi_i - \phi_j)^2}, \qquad k_{T,ij} = \min(p_{T,i}, p_{T,j})\,\Delta R_{ij},$$

$$z_{ij} = \frac{\min(p_{T,i}, p_{T,j})}{p_{T,i} + p_{T,j}}, \qquad m_{ij}^2 = (E_i + E_j)^2 - \lVert \vec{p}_i + \vec{p}_j \rVert^2.$$

Each kinematic feature's logarithm ($\ln \Delta R$, $\ln k_T$, $\ln z$, $\ln m^2$) is used to re-weight the base adjacency, yielding four edge-weighted adjacency matrices $A^{(\Delta R)}$, $A^{(k_T)}$, $A^{(z)}$, $A^{(m^2)}$.
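These per-edge quantities follow directly from the constituents' four-momenta. The sketch below assumes the standard Lund-plane definitions above; the function name `lund_edge_features` and the $(E, p_x, p_y, p_z)$ input ordering are illustrative choices, not from the paper:

```python
import numpy as np

def lund_edge_features(p4_i, p4_j):
    """Lund-plane variables (dR, kT, z, m^2) for one edge, from (E, px, py, pz)."""
    def kin(p):
        E, px, py, pz = p
        pt = np.hypot(px, py)                    # transverse momentum
        eta = np.arcsinh(pz / pt)                # pseudorapidity
        phi = np.arctan2(py, px)                 # azimuth
        return pt, eta, phi
    pt_i, eta_i, phi_i = kin(p4_i)
    pt_j, eta_j, phi_j = kin(p4_j)
    dphi = (phi_i - phi_j + np.pi) % (2 * np.pi) - np.pi   # wrap to (-pi, pi]
    dR = np.hypot(eta_i - eta_j, dphi)                      # angular separation
    kT = min(pt_i, pt_j) * dR                               # relative transverse momentum
    z = min(pt_i, pt_j) / (pt_i + pt_j)                     # momentum fraction
    psum = np.asarray(p4_i, float) + np.asarray(p4_j, float)
    m2 = psum[0] ** 2 - psum[1] ** 2 - psum[2] ** 2 - psum[3] ** 2  # pair mass^2
    return dR, kT, z, m2
```

Note the azimuth difference is wrapped to $(-\pi, \pi]$ before computing $\Delta R$, a standard precaution when $\phi$ values straddle the branch cut.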
Each weighted graph is processed through a dedicated GNN branch with five layers alternating ChebConv ($K = 16$, hidden dim $64$) and EdgeConv, followed by BatchNorm and ReLU. Node features are mean-pooled to obtain a per-graph embedding $h^{(b)} \in \mathbb{R}^{64}$ for each branch $b$. These are stacked into a $4 \times 64$ tensor and further processed by a $1$D convolution across graph channels, then flattened to a $256$-dimensional feature, followed by two fully connected layers producing the jet class logits.
Key hyperparameters include hidden dimension $64$, polynomial order $K = 16$, $4$ parallel branches, $k$-nearest-neighbor graph construction, the AdamW optimizer, batch size $256$, and $0.1$ dropout.
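The branch-fusion step can be sketched compactly. Below, `fuse_branches` is a hypothetical name, and the cross-channel $1$D convolution is modeled as a kernel-size-1 mixing matrix over the four branch channels, an assumption consistent with the stated $4 \times 64 \to 256$ flatten but not confirmed by the source:

```python
import numpy as np

def fuse_branches(branch_embeddings, mix):
    """Stack four 64-d branch embeddings, mix across branch channels, flatten.
    branch_embeddings: list of 4 arrays of shape (64,); mix: (4, 4) channel mixer."""
    stacked = np.stack(branch_embeddings)        # (branches=4, dim=64)
    mixed = mix @ stacked                        # kernel-size-1 conv across branches
    return mixed.reshape(-1)                     # flatten: 4 * 64 = 256 features
```

The $256$-dimensional output then feeds the two fully connected classification layers.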
3. Grad-CAM–based Explainability in E-PCN
Interpretability is achieved by adapting Gradient-weighted Class Activation Mapping (Grad-CAM) to the GNN context. For pre-softmax class score $y^c$, importance weights for graph type $b$ are calculated as:

$$\alpha_b^c = \frac{1}{Z} \sum_{i,j} \frac{\partial y^c}{\partial A_{ij}^{(b)}},$$

with $Z$ as a normalization constant. The class-specific edge activation map is:

$$M_{ij}^{c,(b)} = \mathrm{ReLU}\!\left(\alpha_b^c\, A_{ij}^{(b)}\right).$$
Global branch importance is estimated by averaging $\alpha_b^c$ or $M^{c,(b)}$ across all test instances. The normalized contributions are:
- $\Delta R$: 40.72%
- $k_T$: 35.67%
- $z$: 14.06%
- $m^2$: 9.54%
This quantifies the relative impact of each kinematic feature on classifier decisions, offering direct physical interpretation.
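The normalization step from averaged importances to the percentages above is straightforward; a minimal sketch, assuming the hypothetical helper name `branch_contributions` and that only positive (ReLU-passed) evidence is retained:

```python
import numpy as np

def branch_contributions(avg_importance, labels):
    """Normalize averaged per-branch Grad-CAM importances to percentages."""
    w = np.maximum(np.asarray(avg_importance, dtype=float), 0.0)  # keep positive evidence
    pct = 100.0 * w / w.sum()                                     # normalize to 100%
    return dict(zip(labels, pct))
```

Applied to the four branch-averaged importances, this yields the percentage breakdown reported above.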
4. Empirical Performance and Kinematic Attribution
On the JetClass dataset (10 classes, 1M training jets), E-PCN outperforms the baseline PCN on all macro-averaged metrics, as summarized below:
| Model | Macro-Accuracy | Macro-AUC | Macro-AUPR |
|---|---|---|---|
| PCN (baseline) | 92.49% | 92.94% | 65.99% |
| E-PCN | 94.67% | 96.78% | 82.41% |
Grad-CAM analysis reveals that angular separation ($\Delta R$) and transverse momentum ($k_T$) together account for approximately $76\%$ of classification decisions, corroborating the Lund-plane-driven hypothesis that soft–collinear QCD dynamics are most discriminative in jet substructure. Momentum fraction ($z$) and invariant mass squared ($m^2$) contribute complementary discrimination, especially for heavy-flavor processes.
5. Significance and Implications
The E-PCN framework combines interpretable graph-based learning with physically motivated kinematic encoding, enabling identification of salient features underpinning jet classification tasks. The clear attribution facilitated by kinematic weighting and Grad-CAM–based analysis validates the soft–collinear structure hypothesis and enables domain experts to link model outcomes to QCD substructure intuition.
A plausible implication is that similar multi-branch graph architectures can generalize to other areas where interpretability and domain-based feature attribution are crucial. The explicit quantification of kinematic variable importance supports data-driven theoretical investigations and systematic studies of signal/background separation mechanisms.
6. Connections to Related Methodologies
E-PCN operationalizes Chebyshev spectral graph convolutions [ChebConv], edge-based convolutions [EdgeConv], and Grad-CAM attribution strategies, consistent with formal definitions from prior literature. The multi-graph approach aligns conceptually with Lund-plane jet analysis, emphasizing the integration of domain knowledge into machine learning workflows for collider physics.
The architecture and methodology follow the notation and algorithmic conventions established in the foundational E-PCN publication (Islam et al., 8 Dec 2025). Implementation fidelity requires adherence to prescribed hyperparameters, training schedules, and preprocessing techniques specified in that work.