Accelerating Machine Learning Inference with GPUs in ProtoDUNE Data Processing.

Journal: Computing and Software for Big Science (Q1, Computer Science)
Publication date: 2023-01-01 (Epub: 2023-10-27) · DOI: 10.1007/s41781-023-00101-0
Authors: Tejin Cai, Kenneth Herner, Tingjun Yang, Michael Wang, Maria Acosta Flechas, Philip Harris, Burt Holzman, Kevin Pedro, Nhan Tran
{"title":"Accelerating Machine Learning Inference with GPUs in ProtoDUNE Data Processing.","authors":"Tejin Cai,&nbsp;Kenneth Herner,&nbsp;Tingjun Yang,&nbsp;Michael Wang,&nbsp;Maria Acosta Flechas,&nbsp;Philip Harris,&nbsp;Burt Holzman,&nbsp;Kevin Pedro,&nbsp;Nhan Tran","doi":"10.1007/s41781-023-00101-0","DOIUrl":null,"url":null,"abstract":"<p><p>We study the performance of a cloud-based GPU-accelerated inference server to speed up event reconstruction in neutrino data batch jobs. Using detector data from the ProtoDUNE experiment and employing the standard DUNE grid job submission tools, we attempt to reprocess the data by running several thousand concurrent grid jobs, a rate we expect to be typical of current and future neutrino physics experiments. We process most of the dataset with the GPU version of our processing algorithm and the remainder with the CPU version for timing comparisons. We find that a 100-GPU cloud-based server is able to easily meet the processing demand, and that using the GPU version of the event processing algorithm is two times faster than processing these data with the CPU version when comparing to the newest CPUs in our sample. The amount of data transferred to the inference server during the GPU runs can overwhelm even the highest-bandwidth network switches, however, unless care is taken to observe network facility limits or otherwise distribute the jobs to multiple sites. We discuss the lessons learned from this processing campaign and several avenues for future improvements.</p>","PeriodicalId":36026,"journal":{"name":"Computing and Software for Big Science","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10611601/pdf/","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computing and Software for Big Science","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1007/s41781-023-00101-0","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2023/10/27 0:00:00","PubModel":"Epub","JCR":"Q1","JCRName":"Computer Science","Score":null,"Total":0}
Citations: 1

Abstract

We study the performance of a cloud-based GPU-accelerated inference server used to speed up event reconstruction in neutrino data batch jobs. Using detector data from the ProtoDUNE experiment and the standard DUNE grid job submission tools, we reprocess the data by running several thousand concurrent grid jobs, a rate we expect to be typical of current and future neutrino physics experiments. We process most of the dataset with the GPU version of our processing algorithm and the remainder with the CPU version for timing comparisons. We find that a 100-GPU cloud-based server easily meets the processing demand, and that the GPU version of the event processing algorithm is twice as fast as the CPU version, even when compared with the newest CPUs in our sample. However, the amount of data transferred to the inference server during the GPU runs can overwhelm even the highest-bandwidth network switches unless care is taken to observe network facility limits or to distribute the jobs across multiple sites. We discuss the lessons learned from this processing campaign and several avenues for future improvement.
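The "inference as a service" approach described here moves the neural-network forward pass off the CPU worker node: each grid job serializes its input tensors, sends them over the network to a shared GPU server, and receives the model outputs in reply. Below is a minimal sketch of what one such remote inference call can look like from a grid job, assuming an NVIDIA Triton-style inference server as the GPU backend; the server URL, model name, tensor names, and shapes are hypothetical placeholders rather than the experiment's actual configuration.

```python
# Minimal sketch of a remote GPU inference call from a grid job, assuming
# an NVIDIA Triton-style inference server. The URL, model name, tensor
# names, and shapes below are hypothetical placeholders, not the
# experiment's actual configuration.
import numpy as np
import tritonclient.grpc as grpcclient

# Connect to the (hypothetical) cloud-hosted GPU inference server.
client = grpcclient.InferenceServerClient(url="triton.example.org:8001")

# Stand-in batch of detector-image patches; a real reconstruction job
# would build these from wire-plane data for each event.
batch = np.random.rand(64, 48, 48, 3).astype(np.float32)

infer_input = grpcclient.InferInput("main_input", list(batch.shape), "FP32")
infer_input.set_data_from_numpy(batch)
requested = grpcclient.InferRequestedOutput("softmax_output")

# One network round trip replaces a local CPU forward pass; the server
# batches requests arriving from many concurrent grid jobs onto the GPUs.
result = client.infer(model_name="classifier", inputs=[infer_input],
                      outputs=[requested])
scores = result.as_numpy("softmax_output")
print(scores.shape)  # e.g. (64, n_classes)
```

Because every call ships its input tensors over the network, the aggregate request volume from thousands of concurrent jobs, not the GPU capacity, is what can saturate switches at a single site; this is the motivation for the abstract's point about observing network facility limits or spreading jobs across multiple sites.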


Source journal: Computing and Software for Big Science (Computer Science, miscellaneous)
CiteScore: 6.20 · Self-citation rate: 0.00% · Articles per year: 15