Pre-training strategy using real particle collision data for event classification in collider physics

Published 12 Dec 2023 in hep-ex and physics.comp-ph | (2312.06909v1)

Abstract: This study aims to improve event classification performance in collider physics by introducing a pre-training strategy. Event classification is a typical problem in collider physics, where the goal is to separate signal events of interest from background events as cleanly as possible in order to search for new phenomena in nature. We propose a pre-training strategy that allows the target event classification to be trained efficiently with only a small amount of training data. The novelty is that real particle collision data are used in the pre-training phase, with a self-supervised learning technique employed to handle the unlabeled data. Using real data for pre-training eliminates the need to generate a large amount of training data by simulation and mitigates bias arising from the choice of physics processes in the training data. Our experiments using CMS open data confirmed that high event classification performance can be achieved by introducing a pre-trained model. This pre-training strategy offers a potential way to save computational resources in future collider experiments and introduces a foundation model for event classification.