Decoding Jet Signals: How Machine Learning Is Refining Particle Physics

Author: Denis Avetisyan


The ATLAS experiment is leveraging cutting-edge machine learning techniques to improve the identification and classification of hadronic jets, fundamental building blocks in high-energy particle collisions.

The ParT algorithm demonstrates performance scalability, as assessed against EFN, PFN, and ParticleNet, with its efficacy directly linked to both jet identification efficiency [latex]\epsilon_{sig}[/latex] and jet transverse momentum [latex]p_T[/latex].

This review details recent advancements in jet tagging using machine learning, with a focus on constituent-based models like transformers and graph neural networks within the ATLAS Collaboration.

Precisely identifying the origins and constituents of hadronic jets remains a central challenge in high-energy physics, limiting the precision of measurements at the Large Hadron Collider. This paper, ‘Classifying hadronic objects in ATLAS with ML/AI algorithms’, reviews recent advancements in machine learning techniques employed by the ATLAS Collaboration to address this issue. Constituent-based tagging architectures, particularly those leveraging graph neural networks and transformer-based approaches, demonstrate superior performance in classifying jets and identifying hadronically decaying heavy objects. Will these data-driven strategies pave the way for model-independent jet tagging and ultimately unlock new physics discoveries?


Unraveling the Jet Puzzle: A Quest for Flavor

At the Large Hadron Collider, physicists analyze the debris from proton collisions to understand the fundamental building blocks of matter. These collisions produce “jets” – sprays of particles – and determining whether a jet originated from a quark or a gluon – its “flavour” – is surprisingly complex yet vitally important. Precise measurements of the Standard Model and searches for new physics rely heavily on accurately identifying jet flavour; misidentifying a quark jet as a gluon jet, or vice versa, introduces systematic uncertainties that can mask or falsely suggest new discoveries. The subtle differences in how quarks and gluons interact within a jet manifest as slight variations in the energy distribution and particle composition, requiring sophisticated analysis techniques to disentangle these signals and ultimately refine the precision of particle physics measurements.

Early attempts to categorize jets relied heavily on Boosted Decision Trees, a machine learning technique that proved adequate for initial analyses at the Large Hadron Collider. However, as data collection advanced and collision events became increasingly complex, these methods began to exhibit limitations. The sheer volume of information within modern datasets, coupled with the subtle distinctions between quark- and gluon-initiated jets, strained the capacity of these trees to maintain sufficient accuracy. Specifically, the hand-tuned features used to train the models struggled to capture the nuances present in high-energy collisions, leading to increased misidentification rates and hindering precise measurements of fundamental particle properties. Consequently, researchers recognized the necessity for more sophisticated techniques capable of automatically learning relevant features and adapting to the evolving demands of high-energy physics data analysis.

Historically, identifying the ‘jet flavour’ – the type of quark or gluon initiating a particle jet – depended heavily on physicists manually designing specific features from the detector data to distinguish between them. This process, while initially effective, presents a significant limitation in the evolving landscape of high-energy physics. These hand-engineered features are not inherently flexible; adapting them to changes in detector configurations, or to the influx of larger, more complex datasets from the Large Hadron Collider, requires substantial effort and often compromises performance. The rigidity of these traditional methods contrasts sharply with the dynamic nature of particle physics experiments, where data characteristics and detector conditions are constantly changing, necessitating more adaptable and automated approaches to jet flavour tagging.

Despite slightly reduced background rejection, the [latex]LundNetANN[/latex] tagger effectively decouples performance from jet mass and more accurately models QCD jet contributions compared to the standard [latex]LundNet[/latex] approach.

Graph Neural Networks: Mapping the Inner Structure of Jets

In high-energy physics, jets – collimated sprays of particles resulting from quark or gluon interactions – are commonly analyzed using machine learning techniques. Graph Neural Networks (GNNs) provide a method for representing these jets as graphs, where individual particles within the jet are defined as nodes. The relationships – such as spatial proximity, momentum correlations, and energy flow – between these particles are then represented as edges connecting the nodes. This graph-based representation allows the GNN to explicitly model the internal structure of the jet, moving beyond treating jets as simple collections of individual particle features and instead focusing on the interconnectedness of their constituents. The node features typically include particle type, momentum, and energy, while edge features can encode the distance or relative momentum between particles.
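
The graph construction described above can be sketched in a few lines. In this minimal, illustrative example (not the ATLAS implementation), each constituent becomes a node with [latex](p_T, \eta, \phi)[/latex] features, and edges link every particle to its k nearest neighbours in [latex]\Delta R[/latex], with the angular distance stored as an edge feature; both the k-nearest-neighbour rule and the feature choice are assumptions made for brevity.

```python
import numpy as np

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance in (eta, phi) space, with phi wrap-around."""
    dphi = np.abs(phi1 - phi2)
    dphi = np.where(dphi > np.pi, 2 * np.pi - dphi, dphi)
    return np.sqrt((eta1 - eta2) ** 2 + dphi ** 2)

def jet_to_graph(pt, eta, phi, k=3):
    """Represent a jet as a graph: each constituent is a node with
    (pT, eta, phi) features; edges connect each particle to its k
    nearest neighbours in Delta-R, carrying Delta-R as the edge feature."""
    n = len(pt)
    nodes = np.stack([pt, eta, phi], axis=1)
    edges = []
    for i in range(n):
        dr = delta_r(eta[i], phi[i], eta, phi)
        dr[i] = np.inf                              # exclude self-loops
        for j in np.argsort(dr)[: min(k, n - 1)]:
            edges.append((i, int(j), float(dr[j])))
    return nodes, edges

# Toy jet with four constituents
pt = np.array([50.0, 30.0, 10.0, 5.0])
eta = np.array([0.10, 0.12, -0.05, 0.00])
phi = np.array([0.20, 0.25, 0.15, 0.30])
nodes, edges = jet_to_graph(pt, eta, phi, k=2)
```

With four constituents and k = 2, the graph has four nodes and eight directed edges, each annotated with its [latex]\Delta R[/latex] separation.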

Traditional jet tagging methods typically rely on hand-engineered features summarizing particle information, often losing information about particle interrelationships. Graph Neural Networks (GNNs) address this limitation by directly operating on a graph representation of the jet, where particles are nodes and their four-momentum-based relationships define the edges. This allows the network to learn complex correlations between constituent particles, including those arising from boosted decays or internal jet substructure. By propagating information between nodes based on their connections, GNNs can effectively capture multi-particle interactions and identify subtle patterns indicative of specific jet origins that are difficult to encode using fixed, pre-defined features. Consequently, GNNs demonstrate an improved ability to discriminate between different jet types, particularly in challenging scenarios with high pile-up or low signal strength.
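
The information propagation described above can be illustrated with one round of message passing, in which every node averages its neighbours' features and mixes them with its own through learned weight matrices. This is a schematic sketch assuming simple mean aggregation; production taggers use richer update rules such as EdgeConv or attention.

```python
import numpy as np

def message_pass(node_feats, edges, W_self, W_nbr):
    """One round of message passing: each node's new feature vector is
    a nonlinear mix of its own features and the mean of its neighbours'
    features. `edges` is a list of directed (src, dst) pairs."""
    n, d = node_feats.shape
    agg = np.zeros_like(node_feats)
    deg = np.zeros(n)
    for src, dst in edges:
        agg[dst] += node_feats[src]
        deg[dst] += 1
    agg /= np.maximum(deg, 1)[:, None]              # mean over neighbours
    return np.tanh(node_feats @ W_self + agg @ W_nbr)

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))                         # 4 particles, 3 features each
edges = [(0, 1), (1, 0), (1, 2), (2, 1), (2, 3), (3, 2)]
W_self = rng.normal(size=(3, 3))                    # random weights stand in
W_nbr = rng.normal(size=(3, 3))                     # for learned parameters
H = message_pass(X, edges, W_self, W_nbr)           # updated node features
```

Stacking several such rounds lets information flow across the whole jet, which is how a GNN captures multi-particle correlations beyond nearest neighbours.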

ParticleNet is a graph neural network architecture specifically designed for particle physics applications, and has demonstrated improvements in jet tagging performance. Evaluations on benchmark datasets show that ParticleNet achieves superior quark/gluon discrimination compared to traditional methods relying on hand-engineered features. Furthermore, the model exhibits significant gains in top-quark identification, a particularly challenging task due to the complex decay topologies and high background rates. These performance improvements are attributable to the model’s ability to learn and utilize the relationships between constituent particles within a jet, leading to more effective jet categorization.

Traditional jet tagging relied heavily on hand-engineered features designed to isolate signal from background. This approach limits adaptability to different jet types and potential improvements as detector capabilities evolve. Utilizing learned representations, as implemented in Graph Neural Networks, bypasses this limitation by allowing the model to automatically discover relevant features from the raw data. This data-driven approach enhances robustness because the model isn’t dependent on pre-defined, potentially suboptimal, features. Furthermore, the learned representations generalize more effectively to unseen data or variations in detector conditions, improving performance across different experiments and analyses without requiring substantial re-engineering of features.

The DeParT transformer-based algorithm demonstrates improved gluon-jet rejection [latex]\epsilon^{-1}_{g}[/latex] as a function of both quark identification efficiency [latex]\epsilon_{q}[/latex] and transverse momentum [latex]p_{T}[/latex], outperforming architectures like FC DNN, PFN, EFN, and ParticleNet.

Decoding Jet Histories: The Lund Plane and Beyond

The Lund Jet Plane is a visualization technique that represents the history of jet clustering through a 2D map. This map encodes information about the energy flow within a jet as it develops from initial partons. Specifically, the jet is declustered step by step, and each splitting is plotted by the logarithm of its relative transverse momentum, [latex]\ln(k_{T})[/latex], against the logarithm of the inverse angular separation of the two branches, [latex]\ln(1/\Delta R)[/latex]. The resulting distribution reveals the hierarchical structure of the jet: hard, wide-angle emissions from early in the shower populate the large-angle, high-[latex]k_{T}[/latex] region, while soft, collinear emissions from later stages accumulate at small angles and low [latex]k_{T}[/latex]. This detailed encoding of jet formation allows for a more nuanced understanding of the underlying parton shower dynamics and provides a direct input for machine learning algorithms designed to analyze jet substructure.
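
A minimal sketch of how the primary Lund-plane coordinates can be extracted: cluster the constituents pairwise by smallest [latex]\Delta R[/latex] (a Cambridge/Aachen-style sequence), then walk back down the harder branch of the clustering tree and record one [latex](\ln(1/\Delta R), \ln k_T)[/latex] point per splitting. The pT-weighted axis combination used in `merge` is a simplification for brevity; a real implementation sums four-momenta.

```python
import math

class PseudoJet:
    def __init__(self, pt, eta, phi, children=None):
        self.pt, self.eta, self.phi = pt, eta, phi
        self.children = children            # None for original constituents

def dR(a, b):
    """Angular distance between two pseudojets, with phi wrap-around."""
    dphi = abs(a.phi - b.phi)
    if dphi > math.pi:
        dphi = 2 * math.pi - dphi
    return math.hypot(a.eta - b.eta, dphi)

def merge(a, b):
    """Combine two pseudojets (pT-weighted axis: a simplifying assumption)."""
    pt = a.pt + b.pt
    eta = (a.pt * a.eta + b.pt * b.eta) / pt
    phi = (a.pt * a.phi + b.pt * b.phi) / pt
    return PseudoJet(pt, eta, phi, children=(a, b))

def cluster_CA(jets):
    """Cambridge/Aachen-style clustering: repeatedly merge the pair
    with the smallest Delta-R until one jet remains."""
    jets = list(jets)
    while len(jets) > 1:
        i, j = min(((i, j) for i in range(len(jets))
                    for j in range(i + 1, len(jets))),
                   key=lambda ij: dR(jets[ij[0]], jets[ij[1]]))
        a, b = jets[i], jets[j]
        jets = [p for k, p in enumerate(jets) if k not in (i, j)]
        jets.append(merge(a, b))
    return jets[0]

def primary_lund_plane(root):
    """Decluster along the harder branch, emitting one Lund point
    (ln(1/DeltaR), ln(kT)) per splitting."""
    points = []
    node = root
    while node.children is not None:
        a, b = node.children
        hard, soft = (a, b) if a.pt >= b.pt else (b, a)
        delta = dR(a, b)
        points.append((math.log(1.0 / delta), math.log(soft.pt * delta)))
        node = hard
    return points

parts = [PseudoJet(40.0, 0.0, 0.0), PseudoJet(20.0, 0.3, 0.1),
         PseudoJet(5.0, 0.8, 0.5)]
lund_points = primary_lund_plane(cluster_CA(parts))
```

For three constituents the declustering yields two splittings, hence two Lund-plane points.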

LundNet is a machine learning model that utilizes the Lund Jet Plane – a visualization of jet clustering history – as input to a Graph Neural Network (GNN). This approach enables the model to directly learn from the sequential process of jet formation, where each node in the GNN represents a constituent particle and edges represent the clustering history as defined by the Lund Jet Plane. By representing the jet as a dynamic graph, LundNet captures information about the energy flow and branching structure during the parton shower, allowing it to differentiate between various jet production mechanisms and improve jet tagging performance compared to models relying solely on final-state jet properties.

LundNetANN improves jet tagging performance through adversarial training, a technique designed to reduce correlation between the tagger’s output and the jet’s mass. This is achieved by introducing an adversarial network that attempts to predict jet mass from the tagger’s output; the tagger is then trained to minimize the adversary’s ability to accurately predict mass. By decorrelating the tagger output from jet mass, the model becomes less reliant on this specific feature, leading to improved generalization when applied to datasets with variations in jet mass or generated by different underlying event simulations. This approach effectively encourages the tagger to focus on more robust and generalizable features of the jet structure, mitigating potential biases and improving overall performance across diverse datasets.
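
The competing objectives can be summarized in a single schematic loss: the tagger minimizes its classification loss minus [latex]\lambda[/latex] times the adversary's mass-regression loss, while the adversary separately minimizes its own loss. The sketch below shows only the loss arithmetic, with the actual networks and alternating optimization left out; the mean-squared-error adversary is an illustrative assumption.

```python
import numpy as np

def bce(p, y):
    """Binary cross-entropy of tagger scores p against labels y."""
    p = np.clip(p, 1e-7, 1 - 1e-7)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

def adversarial_objective(tag_scores, labels, mass_pred, mass_true, lam=10.0):
    """Schematic decorrelation objective: the tagger minimises
    L_cls - lam * L_adv, where L_adv is the adversary's loss for
    predicting jet mass from the tagger output. A large L_adv means
    the score carries little mass information, which is rewarded."""
    L_cls = bce(tag_scores, labels)
    L_adv = float(np.mean((mass_pred - mass_true) ** 2))
    return L_cls - lam * L_adv, L_cls, L_adv

obj, L_cls, L_adv = adversarial_objective(
    np.array([0.9, 0.1]), np.array([1.0, 0.0]),      # tagger scores, labels
    np.array([80.0, 80.0]), np.array([80.0, 100.0]), # adversary's mass guesses
    lam=0.01)
```

In practice the two minimizations alternate (or use a gradient-reversal layer), and [latex]\lambda[/latex] sets the trade-off between raw tagging power and mass independence.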

Evaluations of LundNet-based jet taggers have revealed a significant sensitivity to the underlying parton shower model used in simulation. Specifically, performance degradation of up to 40% has been observed when the tagger is tested with a parton shower model differing from the one used during training. This result underscores the necessity for developing robust jet taggers that are less dependent on specific simulation details and can generalize effectively across variations in the simulated environment. The observed performance drop indicates that current models can overfit to characteristics of the training simulation, limiting their applicability to real data or simulations employing different underlying assumptions.

Dynamic Architectures: A New Level of Jet Intelligence

The identification of particle jets – sprays of particles resulting from high-energy collisions – benefits from increasingly sophisticated analysis techniques. Recent advancements, specifically through the development of ParT and subsequently DeParT, introduce a dynamically enhanced particle transformer architecture. Unlike traditional methods that treat all particles within a jet equally, these models leverage attention mechanisms to prioritize the most relevant constituents. This dynamic focusing allows the network to effectively discern the underlying characteristics of the jet, such as whether it originated from a quark or a gluon. By adaptively weighting particle contributions, ParT and DeParT achieve a more nuanced understanding of jet structure, ultimately improving the precision of jet tagging and enabling more detailed studies of fundamental particle interactions.

DeParT represents a significant advancement in jet physics by directly processing Particle Flow Objects – that is, the reconstructed particles within a jet – rather than relying on higher-level jet features. This direct operation allows the model to learn more nuanced relationships between individual particles, leading to state-of-the-art performance in distinguishing between quark-initiated and gluon-initiated jets. By circumventing the need for hand-engineered features, DeParT’s attention mechanism can effectively identify the most salient particles for jet classification, achieving superior accuracy and robustness compared to previous methods. The result is a powerful tool for precisely identifying the origin of jets, crucial for unraveling the complex dynamics of particle collisions.

The recent advancements in jet physics, particularly with models like ParT and DeParT, highlight the remarkable capability of attention mechanisms to decipher the intricate relationships within particle jets. Unlike traditional methods that treat all jet constituents equally, these models selectively focus on the most pertinent particles, mirroring how physicists analyze collision events. This dynamic weighting, achieved through attention, allows the model to identify subtle patterns and correlations that would otherwise be obscured by the sheer volume of data. The success of these approaches isn’t simply about processing more information; it’s about intelligently prioritizing it, revealing the underlying structure of the jet and enabling more precise identification of the initiating particle – whether it be a quark or a gluon – and thus improving the overall understanding of fundamental particle interactions.
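
In its simplest single-head form, the attention mechanism at the heart of these models reduces to a few matrix operations: every constituent queries every other, and the resulting weights determine how strongly each particle contributes to each updated representation. A minimal numpy sketch, with random matrices standing in for learned weights:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention over jet
    constituents. X has one row of features per particle; row i of
    the attention matrix A gives the weights with which particle i
    attends to every constituent when building its updated features."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])   # pairwise relevance
    A = softmax(scores, axis=1)              # each row sums to 1
    return A @ V, A

rng = np.random.default_rng(1)
X = rng.normal(size=(5, 4))                  # 5 constituents, 4 features each
Wq, Wk, Wv = (rng.normal(size=(4, 4)) for _ in range(3))
out, A = self_attention(X, Wq, Wk, Wv)
```

The dynamic weighting discussed above lives in the matrix A: unlike a fixed pooling over constituents, these weights are recomputed per jet from the particles' own features.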

Recent advancements in jet tagging, spearheaded by the ATLAS Collaboration, have yielded substantial performance gains through the implementation of dynamically enhanced particle transformers. These methods demonstrate an approximate threefold improvement in the rejection of light jets, those originating from lighter quarks and gluons, significantly reducing background noise in high-energy physics analyses. Furthermore, the enhanced models also achieve improved rejection of gluon jets across a wide range of transverse momenta [latex]p_T[/latex], a crucial metric for precisely identifying and characterizing the fundamental particles produced in proton-proton collisions. This progress underscores the potential of attention-based machine learning to refine the identification of jets, ultimately contributing to more accurate measurements of fundamental particle properties and searches for new physics.
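
The rejection figures quoted throughout follow a standard recipe: fix a working point by the signal efficiency [latex]\epsilon_{sig}[/latex], find the corresponding score threshold, and report the inverse background efficiency [latex]1/\epsilon_{bkg}[/latex]. A small self-contained sketch of that computation:

```python
import numpy as np

def background_rejection(sig_scores, bkg_scores, eff_sig=0.5):
    """Background rejection 1/eps_bkg at a fixed signal efficiency:
    choose the score threshold that keeps eff_sig of signal jets,
    then measure how few background jets survive the same cut."""
    thr = np.quantile(sig_scores, 1.0 - eff_sig)   # cut keeping eff_sig of signal
    eps_bkg = np.mean(bkg_scores >= thr)
    return 1.0 / max(eps_bkg, 1e-12)               # guard against zero survivors

sig = np.array([0.9, 0.8, 0.7, 0.6])               # toy tagger scores, signal jets
bkg = np.array([0.10, 0.20, 0.75, 0.30])           # toy tagger scores, background
rej = background_rejection(sig, bkg, eff_sig=0.5)  # threshold 0.75 -> 4x rejection
```

An "approximate threefold improvement" in light-jet rejection therefore means roughly three times fewer background jets pass the cut at the same signal working point.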

The DeParT algorithm [5] employs a hierarchical architecture to decompose tasks into manageable sub-problems, enabling efficient planning and execution.

The pursuit of jet tagging, as detailed in this work, isn’t merely about categorizing particles; it’s a deliberate probing of the system’s boundaries. One seeks not just to confirm expected behaviors, but to expose the unexpected. As Lev Landau stated, “The only way to learn is to keep asking questions.” This resonates deeply with the methodology employed: constituents are treated as individual signals, allowing constituent-based models, like transformers and graph neural networks, to reveal subtle patterns hidden within the complex data of hadronic jets. The focus on constituent-level information is less about finding quarks and gluons and more about understanding how their interactions manifest as observable jets, a reverse-engineering of fundamental forces at play.

What Lies Beyond the Jets?

The current enthusiasm for constituent-based models – transformers and graph neural networks, in particular – for jet tagging feels… predictable. It’s a natural extension of pattern recognition, but a little too neat. The algorithms excel at finding the signals, of course, but the fundamental question of what a ‘jet’ actually is remains largely untouched. These are, after all, proxies for quarks and gluons, smeared by the messy realities of hadronization and detection. The real challenge isn’t just better classification, but a deeper understanding of that mess – a deconstruction of the jet’s internal structure that moves beyond simply identifying its origin.

Future progress will likely necessitate a move beyond purely kinematic information. The models currently treat jets as collections of particles, but ignore the subtle interplay of strong force dynamics. Incorporating elements of effective field theory or even directly simulating hadronization within the tagging process – a computationally daunting task, admittedly – might unlock a new level of discrimination. The current techniques are excellent at identifying what is there, but remain remarkably silent on why it is there.

One suspects the ultimate limit isn’t algorithmic ingenuity, but the inherent ambiguity of the underlying physics. Perhaps a ‘perfect’ jet tag is an illusion, a horizon beyond which the signal dissolves into the irreducible uncertainty of quantum chromodynamics. That, of course, would be far more interesting than simply achieving 99.9% accuracy.


Original article: https://arxiv.org/pdf/2603.12306.pdf

Contact the author: https://www.linkedin.com/in/avetisyan/


2026-03-16 13:55