{"title":"ALICE运行的快速熵编码","authors":"M. Lettrich","doi":"10.22323/1.390.0913","DOIUrl":null,"url":null,"abstract":"In LHC Run 3, the upgraded ALICE detector will record Pb-Pb collisions at a rate of 50 kHz using continuous readout. The resulting stream of raw data at 3.5 TB/s has to be processed with a set of lossy and lossless compression and data reduction techniques to a storage data rate of 90 GB/s while preserving relevant data for physics analysis. This contribution presents a custom lossless data compression scheme based on entropy coding as the final component in the data reduction chain which has to compress the data rate from 300 GB/s to 90 GB/s. A flexible, multi-process architecture for the data compression scheme is proposed that seamlessly interfaces with the data reduction algorithms of earlier stages and allows to use parallel processing in order to keep the required firm real-time guarantees of the system. The data processed inside the compression process have a structure that allows the use of an rANS entropy coder with more resource efficient static distribution tables. Extensions to the rANS entropy coder are introduced to efficiently work with these static distribution tables and large but sparse source alphabets consisting of up to 25 Bit per symbol. Preliminary performance results show compliance with the firm real-time requirements while offering close-to-optimal data compression.","PeriodicalId":20428,"journal":{"name":"Proceedings of 40th International Conference on High Energy physics — PoS(ICHEP2020)","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2021-02-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Fast Entropy Coding for ALICE Run 3\",\"authors\":\"M. Lettrich\",\"doi\":\"10.22323/1.390.0913\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In LHC Run 3, the upgraded ALICE detector will record Pb-Pb collisions at a rate of 50 kHz using continuous readout. The resulting stream of raw data at 3.5 TB/s has to be processed with a set of lossy and lossless compression and data reduction techniques to a storage data rate of 90 GB/s while preserving relevant data for physics analysis. This contribution presents a custom lossless data compression scheme based on entropy coding as the final component in the data reduction chain which has to compress the data rate from 300 GB/s to 90 GB/s. A flexible, multi-process architecture for the data compression scheme is proposed that seamlessly interfaces with the data reduction algorithms of earlier stages and allows to use parallel processing in order to keep the required firm real-time guarantees of the system. The data processed inside the compression process have a structure that allows the use of an rANS entropy coder with more resource efficient static distribution tables. Extensions to the rANS entropy coder are introduced to efficiently work with these static distribution tables and large but sparse source alphabets consisting of up to 25 Bit per symbol. 
Preliminary performance results show compliance with the firm real-time requirements while offering close-to-optimal data compression.\",\"PeriodicalId\":20428,\"journal\":{\"name\":\"Proceedings of 40th International Conference on High Energy physics — PoS(ICHEP2020)\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-02-16\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of 40th International Conference on High Energy physics — PoS(ICHEP2020)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.22323/1.390.0913\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of 40th International Conference on High Energy physics — PoS(ICHEP2020)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.22323/1.390.0913","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 1
Abstract
In LHC Run 3, the upgraded ALICE detector will record Pb-Pb collisions at a rate of 50 kHz using continuous readout. The resulting stream of raw data at 3.5 TB/s has to be processed with a set of lossy and lossless compression and data reduction techniques to a storage data rate of 90 GB/s while preserving relevant data for physics analysis. This contribution presents a custom lossless data compression scheme based on entropy coding as the final component in the data reduction chain, which has to compress the data rate from 300 GB/s to 90 GB/s. A flexible, multi-process architecture for the data compression scheme is proposed that interfaces seamlessly with the data reduction algorithms of earlier stages and allows the use of parallel processing in order to meet the system's firm real-time requirements. The data handled by the compression process are structured such that an rANS entropy coder with more resource-efficient static distribution tables can be used. Extensions to the rANS entropy coder are introduced to work efficiently with these static distribution tables and with large but sparse source alphabets of up to 25 bits per symbol. Preliminary performance results show compliance with the firm real-time requirements while offering close-to-optimal data compression.
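
The central building block named in the abstract is an rANS (range variant of asymmetric numeral systems) entropy coder driven by static distribution tables. As a rough illustration of how such a coder operates, the C++ sketch below implements the basic rANS encode/decode recurrences with a 32-bit state, byte-wise renormalization, and a hard-coded static frequency table. It is not the ALICE O2 implementation: the constants (kProbBits, kRansL), the 4-symbol alphabet, and all identifiers are illustrative assumptions, and the paper's extensions for sparse alphabets of up to 25 bits per symbol are not reproduced here.

#include <array>
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <vector>

// Minimal static-table rANS coder sketch (illustrative only, not the ALICE O2 code).
constexpr uint32_t kProbBits  = 14;                // total of all frequencies is 2^14
constexpr uint32_t kProbScale = 1u << kProbBits;
constexpr uint32_t kRansL     = 1u << 23;          // lower bound of the normalized state

struct SymbolStats {
    std::array<uint32_t, 4> freq;   // quantized symbol frequencies, summing to kProbScale
    std::array<uint32_t, 4> cumul;  // exclusive prefix sums of freq
};

// Encode symbols back-to-front so the decoder can run forward; emitted bytes go to `out`.
std::vector<uint8_t> encode(const std::vector<uint8_t>& symbols, const SymbolStats& st) {
    std::vector<uint8_t> out;
    uint32_t x = kRansL;
    for (auto it = symbols.rbegin(); it != symbols.rend(); ++it) {
        const uint32_t f = st.freq[*it];
        const uint32_t c = st.cumul[*it];
        // renormalize: push out low bytes until the encoding step cannot overflow
        const uint32_t xMax = ((kRansL >> kProbBits) << 8) * f;
        while (x >= xMax) {
            out.push_back(static_cast<uint8_t>(x & 0xffu));
            x >>= 8;
        }
        // rANS encoding step: x' = floor(x/f)*M + (x mod f) + c, with M = kProbScale
        x = (x / f) * kProbScale + (x % f) + c;
    }
    for (int i = 0; i < 4; ++i) {   // flush the final 32-bit state, low byte first
        out.push_back(static_cast<uint8_t>(x & 0xffu));
        x >>= 8;
    }
    return out;                     // the decoder consumes this buffer back-to-front (LIFO)
}

std::vector<uint8_t> decode(std::vector<uint8_t> stream, std::size_t n, const SymbolStats& st) {
    uint32_t x = 0;
    for (int i = 0; i < 4; ++i) {   // recover the flushed state (pushed last, low byte first)
        x = (x << 8) | stream.back();
        stream.pop_back();
    }
    std::vector<uint8_t> symbols;
    symbols.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        const uint32_t slot = x & (kProbScale - 1);
        uint8_t s = 0;              // linear scan of the static table; real coders use a lookup table
        while (st.cumul[s] + st.freq[s] <= slot) ++s;
        symbols.push_back(s);
        // rANS decoding step: x = f*floor(x/M) + (x mod M) - c
        x = st.freq[s] * (x >> kProbBits) + slot - st.cumul[s];
        while (x < kRansL && !stream.empty()) {   // renormalize by pulling bytes back in
            x = (x << 8) | stream.back();
            stream.pop_back();
        }
    }
    return symbols;
}

int main() {
    // static distribution table: frequencies sum to kProbScale, cumul is their prefix sum
    const SymbolStats st{{8192, 4096, 2048, 2048}, {0, 8192, 12288, 14336}};
    const std::vector<uint8_t> msg{0, 1, 0, 2, 3, 0, 0, 1};
    const auto enc = encode(msg, st);
    const auto dec = decode(enc, msg.size(), st);
    std::cout << (dec == msg ? "roundtrip ok" : "roundtrip FAILED")
              << " (" << enc.size() << " bytes for " << msg.size() << " symbols)\n";
    return 0;
}

With a static table the per-symbol work reduces to a few integer operations and a table lookup, with no adaptive frequency updates, which is what makes this style of coder compatible with the parallel, multi-process processing and the firm real-time constraints described in the abstract.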