对在ALICE O系统中传输大数据量的消息队列库和网络技术进行基准测试

V. C. Barroso, U. Fuchs, A. Wegrzynek
{"title":"对在ALICE O系统中传输大数据量的消息队列库和网络技术进行基准测试","authors":"V. C. Barroso, U. Fuchs, A. Wegrzynek","doi":"10.1109/RTC.2016.7543162","DOIUrl":null,"url":null,"abstract":"ALICE (A Large Ion Collider Experiment) is the heavy-ion detector designed to study the physics of strongly interacting matter and the quark-gluon plasma at the CERN LHC (Large Hadron Collider). ALICE has been successfully collecting physics data of Run 2 since spring 2015. In parallel, preparations for a major upgrade, called O2 (Online-Offline) and scheduled for the Long Shutdown 2 in 2019-2020, are being made. One of the major requirements is the capacity to transport data between so-called FLPs (First Level Processors), equipped with readout cards, and the EPNs (Event Processing Node), performing data aggregation, frame building and partial reconstruction. It is foreseen to have 268 FLPs dispatching data to 1500 EPNs with an average output of 20 Gb/s each. In overall, the O2 processing system will operate at terabits per second of throughput while handling millions of concurrent connections. To meet these requirements, the software and hardware layers of the new system need to be fully evaluated. In order to achieve a high performance to cost ratio three networking technologies (Ethernet, InfiniBand and Omni-Path) were benchmarked on Intel and IBM platforms. The core of the new transport layer will be based on a message queue library that supports push-pull and request-reply communication patterns and multipart messages. ZeroMQ and nanomsg are being evaluated as candidates and were tested in detail over the selected network technologies. This paper describes the benchmark programs and setups that were used during the tests, the significance of tuned kernel parameters, the configuration of network driver and the tuning of multi-core, multi-CPU, and NUMA (Non-Uniform Memory Access) architecture. It presents, compares and comments the final results. Eventually, it indicates the most efficient network technology and message queue library pair and provides an evaluation of the needed CPU and memory resources to handle foreseen traffic.","PeriodicalId":383702,"journal":{"name":"2016 IEEE-NPSS Real Time Conference (RT)","volume":"297 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"Benchmarking message queue libraries and network technologies to transport large data volume in the ALICE O system\",\"authors\":\"V. C. Barroso, U. Fuchs, A. Wegrzynek\",\"doi\":\"10.1109/RTC.2016.7543162\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"ALICE (A Large Ion Collider Experiment) is the heavy-ion detector designed to study the physics of strongly interacting matter and the quark-gluon plasma at the CERN LHC (Large Hadron Collider). ALICE has been successfully collecting physics data of Run 2 since spring 2015. In parallel, preparations for a major upgrade, called O2 (Online-Offline) and scheduled for the Long Shutdown 2 in 2019-2020, are being made. One of the major requirements is the capacity to transport data between so-called FLPs (First Level Processors), equipped with readout cards, and the EPNs (Event Processing Node), performing data aggregation, frame building and partial reconstruction. It is foreseen to have 268 FLPs dispatching data to 1500 EPNs with an average output of 20 Gb/s each. In overall, the O2 processing system will operate at terabits per second of throughput while handling millions of concurrent connections. To meet these requirements, the software and hardware layers of the new system need to be fully evaluated. In order to achieve a high performance to cost ratio three networking technologies (Ethernet, InfiniBand and Omni-Path) were benchmarked on Intel and IBM platforms. The core of the new transport layer will be based on a message queue library that supports push-pull and request-reply communication patterns and multipart messages. ZeroMQ and nanomsg are being evaluated as candidates and were tested in detail over the selected network technologies. This paper describes the benchmark programs and setups that were used during the tests, the significance of tuned kernel parameters, the configuration of network driver and the tuning of multi-core, multi-CPU, and NUMA (Non-Uniform Memory Access) architecture. It presents, compares and comments the final results. Eventually, it indicates the most efficient network technology and message queue library pair and provides an evaluation of the needed CPU and memory resources to handle foreseen traffic.\",\"PeriodicalId\":383702,\"journal\":{\"name\":\"2016 IEEE-NPSS Real Time Conference (RT)\",\"volume\":\"297 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2016-08-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2016 IEEE-NPSS Real Time Conference (RT)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/RTC.2016.7543162\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 IEEE-NPSS Real Time Conference (RT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/RTC.2016.7543162","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3

摘要

ALICE(大型离子对撞机实验)是欧洲核子研究中心(CERN)大型强子对撞机(LHC)设计的用于研究强相互作用物质和夸克-胶子等离子体物理的重离子探测器。自2015年春季以来,ALICE已经成功地收集了Run 2的物理数据。与此同时,一项名为O2(在线-离线)的重大升级的准备工作也在进行中,该升级计划于2019-2020年长期停工。其中一个主要要求是在配备读出卡的所谓的flp(一级处理器)和epn(事件处理节点)之间传输数据的能力,执行数据聚合、框架构建和部分重建。预计将有268个flp将数据分发到1500个epn,每个epn的平均输出为20 Gb/s。总的来说,O2处理系统将以每秒太比特的吞吐量运行,同时处理数百万个并发连接。为了满足这些要求,需要对新系统的软件和硬件层进行充分的评估。为了实现高性能成本比,三种网络技术(以太网,InfiniBand和Omni-Path)在英特尔和IBM平台上进行了基准测试。新传输层的核心将基于消息队列库,该库支持推拉和请求-应答通信模式以及多部分消息。ZeroMQ和nanomsg正在作为候选网络进行评估,并在选定的网络技术上进行了详细测试。本文介绍了测试过程中使用的基准程序和设置,调优内核参数的意义,网络驱动程序的配置以及多核、多cpu和NUMA(非统一内存访问)体系结构的调优。它展示、比较和评论了最终结果。最后,它指出了最有效的网络技术和消息队列库对,并提供了处理可预见流量所需的CPU和内存资源的评估。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Benchmarking message queue libraries and network technologies to transport large data volume in the ALICE O system
ALICE (A Large Ion Collider Experiment) is the heavy-ion detector designed to study the physics of strongly interacting matter and the quark-gluon plasma at the CERN LHC (Large Hadron Collider). ALICE has been successfully collecting physics data of Run 2 since spring 2015. In parallel, preparations for a major upgrade, called O2 (Online-Offline) and scheduled for the Long Shutdown 2 in 2019-2020, are being made. One of the major requirements is the capacity to transport data between so-called FLPs (First Level Processors), equipped with readout cards, and the EPNs (Event Processing Node), performing data aggregation, frame building and partial reconstruction. It is foreseen to have 268 FLPs dispatching data to 1500 EPNs with an average output of 20 Gb/s each. In overall, the O2 processing system will operate at terabits per second of throughput while handling millions of concurrent connections. To meet these requirements, the software and hardware layers of the new system need to be fully evaluated. In order to achieve a high performance to cost ratio three networking technologies (Ethernet, InfiniBand and Omni-Path) were benchmarked on Intel and IBM platforms. The core of the new transport layer will be based on a message queue library that supports push-pull and request-reply communication patterns and multipart messages. ZeroMQ and nanomsg are being evaluated as candidates and were tested in detail over the selected network technologies. This paper describes the benchmark programs and setups that were used during the tests, the significance of tuned kernel parameters, the configuration of network driver and the tuning of multi-core, multi-CPU, and NUMA (Non-Uniform Memory Access) architecture. It presents, compares and comments the final results. Eventually, it indicates the most efficient network technology and message queue library pair and provides an evaluation of the needed CPU and memory resources to handle foreseen traffic.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信