Multi-scale cross-attention transformer encoder for event classification

Published 31 Dec 2023 in hep-ph | (2401.00452v3)

Abstract: We deploy a Machine Learning (ML) framework built on a multi-scale cross-attention encoder for event classification to identify the $gg\to H\to hh\to b\bar b b\bar b$ process at the High Luminosity Large Hadron Collider (HL-LHC), where $h$ is the discovered Standard Model (SM)-like Higgs boson and $H$ a heavier version of it (with $m_H>2m_h$). In the ensuing boosted Higgs regime, the final state consists of two fat jets. Our multi-modal network extracts information from the jet substructure and from the kinematics of the final state particles through self-attention transformer layers. The information learned from these two sources is then integrated by an additional transformer encoder with cross-attention heads to improve classification performance. We show that our approach outperforms the alternative methods currently used to establish sensitivity to this process, whether based solely on kinematic analysis or on a combination of kinematic analysis with mainstream ML approaches. We then employ several interpretation methods to evaluate the network results, including attention map analysis and visual representation of Gradient-weighted Class Activation Mapping (Grad-CAM). Finally, we note that the proposed network is generic and can be applied to analyse any process carrying information at different scales. Our code is publicly available for generic use.
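
The two-stage idea described in the abstract, self-attention within each input stream followed by cross-attention between streams, can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation. The token counts, the embedding size `d_model`, the random inputs, the omission of learned projection matrices and multiple heads, and the choice of letting the kinematic tokens act as queries over the substructure tokens are all assumptions made for illustration.

```python
import math
import random


def transpose(m):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    shift = max(row)
    exps = [math.exp(x - shift) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention; returns (output, attention weights)."""
    d = len(keys[0])
    scores = matmul(queries, transpose(keys))
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, values), weights


def random_tokens(n, d, rng):
    """Hypothetical stand-in for embedded input tokens."""
    return [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]


rng = random.Random(0)
d_model = 4  # assumed embedding size
substructure = random_tokens(6, d_model, rng)  # assumed: 6 jet-constituent tokens
kinematics = random_tokens(2, d_model, rng)    # assumed: 2 fat-jet tokens

# Stage 1: self-attention, each stream attends only to itself.
sub_out, _ = attention(substructure, substructure, substructure)
kin_out, _ = attention(kinematics, kinematics, kinematics)

# Stage 2: cross-attention, queries from one stream, keys/values from the other.
fused, cross_weights = attention(kin_out, sub_out, sub_out)
```

Here `fused` has one row per kinematic token, and each row of `cross_weights` is a probability distribution over the substructure tokens, which is the kind of quantity an attention map analysis would inspect.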