提炼粒子知识，促进高能物理实验的快速重建

IF 4.6 2区物理与天体物理 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Machine Learning Science and Technology Pub Date : 2024-05-06 DOI:10.1088/2632-2153/ad43b1

A Bal, T Brandes, F Iemmi, M Klute, B Maier, V Mikuni and T K Årrestad

{"title":"提炼粒子知识，促进高能物理实验的快速重建","authors":"A Bal, T Brandes, F Iemmi, M Klute, B Maier, V Mikuni and T K Årrestad","doi":"10.1088/2632-2153/ad43b1","DOIUrl":null,"url":null,"abstract":"Knowledge distillation is a form of model compression that allows artificial neural networks of different sizes to learn from one another. Its main application is the compactification of large deep neural networks to free up computational resources, in particular on edge devices. In this article, we consider proton-proton collisions at the High-Luminosity Large Hadron Collider (HL-LHC) and demonstrate a successful knowledge transfer from an event-level graph neural network (GNN) to a particle-level small deep neural network (DNN). Our algorithm, DistillNet, is a DNN that is trained to learn about the provenance of particles, as provided by the soft labels that are the GNN outputs, to predict whether or not a particle originates from the primary interaction vertex. The results indicate that for this problem, which is one of the main challenges at the HL-LHC, there is minimal loss during the transfer of knowledge to the small student network, while improving significantly the computational resource needs compared to the teacher. This is demonstrated for the distilled student network on a CPU, as well as for a quantized and pruned student network deployed on an field programmable gate array. Our study proves that knowledge transfer between networks of different complexity can be used for fast artificial intelligence (AI) in high-energy physics that improves the expressiveness of observables over non-AI-based reconstruction algorithms. Such an approach can become essential at the HL-LHC experiments, e.g. to comply with the resource budget of their trigger stages.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"15 1","pages":""},"PeriodicalIF":4.6000,"publicationDate":"2024-05-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Distilling particle knowledge for fast reconstruction at high-energy physics experiments\",\"authors\":\"A Bal, T Brandes, F Iemmi, M Klute, B Maier, V Mikuni and T K Årrestad\",\"doi\":\"10.1088/2632-2153/ad43b1\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Knowledge distillation is a form of model compression that allows artificial neural networks of different sizes to learn from one another. Its main application is the compactification of large deep neural networks to free up computational resources, in particular on edge devices. In this article, we consider proton-proton collisions at the High-Luminosity Large Hadron Collider (HL-LHC) and demonstrate a successful knowledge transfer from an event-level graph neural network (GNN) to a particle-level small deep neural network (DNN). Our algorithm, DistillNet, is a DNN that is trained to learn about the provenance of particles, as provided by the soft labels that are the GNN outputs, to predict whether or not a particle originates from the primary interaction vertex. The results indicate that for this problem, which is one of the main challenges at the HL-LHC, there is minimal loss during the transfer of knowledge to the small student network, while improving significantly the computational resource needs compared to the teacher. This is demonstrated for the distilled student network on a CPU, as well as for a quantized and pruned student network deployed on an field programmable gate array. Our study proves that knowledge transfer between networks of different complexity can be used for fast artificial intelligence (AI) in high-energy physics that improves the expressiveness of observables over non-AI-based reconstruction algorithms. Such an approach can become essential at the HL-LHC experiments, e.g. to comply with the resource budget of their trigger stages.\",\"PeriodicalId\":33757,\"journal\":{\"name\":\"Machine Learning Science and Technology\",\"volume\":\"15 1\",\"pages\":\"\"},\"PeriodicalIF\":4.6000,\"publicationDate\":\"2024-05-06\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Machine Learning Science and Technology\",\"FirstCategoryId\":\"101\",\"ListUrlMain\":\"https://doi.org/10.1088/2632-2153/ad43b1\",\"RegionNum\":2,\"RegionCategory\":\"物理与天体物理\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Machine Learning Science and Technology","FirstCategoryId":"101","ListUrlMain":"https://doi.org/10.1088/2632-2153/ad43b1","RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

摘要

知识蒸馏是一种模型压缩形式，它允许不同规模的人工神经网络相互学习。它的主要应用是压缩大型深度神经网络，以释放计算资源，特别是在边缘设备上。在本文中，我们考虑了高亮度大型强子对撞机（HL-LHC）的质子-质子对撞，并演示了从事件级图神经网络（GNN）到粒子级小型深度神经网络（DNN）的成功知识转移。我们的算法 DistillNet 是一种 DNN，经过训练后可以学习粒子的来源（由作为 GNN 输出的软标签提供），从而预测粒子是否来自主要相互作用顶点。结果表明，对于这个问题（这是 HL-LHC 面临的主要挑战之一），在向小型学生网络传输知识的过程中，损失极小，而与教师相比，计算资源需求却有显著改善。这一点在中央处理器上的经过提炼的学生网络，以及部署在现场可编程门阵列上的经过量化和剪枝的学生网络上都得到了证明。我们的研究证明，不同复杂度网络之间的知识转移可用于高能物理领域的快速人工智能（AI），与非基于人工智能的重构算法相比，可提高观测数据的表现力。这种方法在大型强子对撞机（HL-LHC）实验中至关重要，例如，可以满足其触发阶段的资源预算。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Distilling particle knowledge for fast reconstruction at high-energy physics experiments

Knowledge distillation is a form of model compression that allows artificial neural networks of different sizes to learn from one another. Its main application is the compactification of large deep neural networks to free up computational resources, in particular on edge devices. In this article, we consider proton-proton collisions at the High-Luminosity Large Hadron Collider (HL-LHC) and demonstrate a successful knowledge transfer from an event-level graph neural network (GNN) to a particle-level small deep neural network (DNN). Our algorithm, DistillNet, is a DNN that is trained to learn about the provenance of particles, as provided by the soft labels that are the GNN outputs, to predict whether or not a particle originates from the primary interaction vertex. The results indicate that for this problem, which is one of the main challenges at the HL-LHC, there is minimal loss during the transfer of knowledge to the small student network, while improving significantly the computational resource needs compared to the teacher. This is demonstrated for the distilled student network on a CPU, as well as for a quantized and pruned student network deployed on an field programmable gate array. Our study proves that knowledge transfer between networks of different complexity can be used for fast artificial intelligence (AI) in high-energy physics that improves the expressiveness of observables over non-AI-based reconstruction algorithms. Such an approach can become essential at the HL-LHC experiments, e.g. to comply with the resource budget of their trigger stages.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Machine Learning Science and Technology Computer Science-Artificial Intelligence

CiteScore

9.10

自引率

4.40%

发文量

审稿时长

5 weeks

期刊介绍： Machine Learning Science and Technology is a multidisciplinary open access journal that bridges the application of machine learning across the sciences with advances in machine learning methods and theory as motivated by physical insights. Specifically, articles must fall into one of the following categories: advance the state of machine learning-driven applications in the sciences or make conceptual, methodological or theoretical advances in machine learning with applications to, inspiration from, or motivated by scientific problems.