P. Czarnul, Grzegorz Golaszewski, Grzegorz Jereczek, M. Maciejewski
Development and benchmarking a parallel Data AcQuisition framework using MPI with hash and hash+tree structures in a cluster environment
2020 19th International Symposium on Parallel and Distributed Computing (ISPDC), published 2020-07-01
DOI: https://doi.org/10.1109/ISPDC51135.2020.00031
Abstract
In this paper we propose a solution that uses either a 3-layered index structure based on hash tables or a hash+tree structure for efficient parallel processing of data in a Data AcQuisition (DAQ) system. The proposed framework allows parallel data writes from multiple multithreaded client processes to multiple multithreaded server processes that use a thread-safe hash-table-based library. Communication uses an MPI implementation with MPI_THREAD_MULTIPLE enabled. We demonstrate that the solution scales well with increasing numbers of both client and server threads in two cluster configurations using InfiniBand. We show how performance depends on configuration parameters of a DAQ system, such as data distribution across the readout system, its size, and the percentage of data to be fetched. Furthermore, we show how performance depends on the size of the value associated with a given write/read. We compare the performance of both proposed data structures for different configurations. The results give the reader real performance numbers and characteristics of such a solution, applicable to large-scale parallel data processing in a DAQ system, and help in choosing the optimal solution.
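The write path described above (several multithreaded clients writing key/value pairs to several servers, each backed by a thread-safe hash table) can be illustrated with a small in-process sketch. This is not the authors' framework: it uses Python threads instead of MPI processes, and the names `Server`, `owner`, `client`, and `run`, the lock-striping scheme, the key format, and the hash-based placement rule are all assumptions made for illustration. It does not reproduce the paper's 3-layered index or hash+tree structure.

```python
import hashlib
import threading
from concurrent.futures import ThreadPoolExecutor


def _digest(key):
    # 8-byte hash of the key; different halves are used for server and stripe
    # selection so the two choices are not correlated.
    return hashlib.blake2b(key, digest_size=8).digest()


def owner(key, n_servers):
    """Hypothetical placement rule: pick the server that stores a key."""
    return int.from_bytes(_digest(key)[:4], "big") % n_servers


class Server:
    """In-process stand-in for one server: a lock-striped hash table."""

    def __init__(self, stripes=8):
        self._tables = [{} for _ in range(stripes)]
        self._locks = [threading.Lock() for _ in range(stripes)]

    def _stripe(self, key):
        return int.from_bytes(_digest(key)[4:], "big") % len(self._tables)

    def put(self, key, value):
        i = self._stripe(key)
        with self._locks[i]:  # only writers to the same stripe contend
            self._tables[i][key] = value

    def get(self, key):
        i = self._stripe(key)
        with self._locks[i]:
            return self._tables[i].get(key)

    def __len__(self):
        return sum(len(t) for t in self._tables)


def client(client_id, servers, n_writes, payload_size):
    """One client thread: write n_writes values of payload_size bytes."""
    for i in range(n_writes):
        key = f"{client_id}:{i}".encode()  # assumed key format
        servers[owner(key, len(servers))].put(key, bytes(payload_size))


def run(n_servers=2, n_clients=4, n_writes=1000, payload_size=64):
    """Run all client threads to completion and return the servers."""
    servers = [Server() for _ in range(n_servers)]
    with ThreadPoolExecutor(max_workers=n_clients) as pool:
        futures = [
            pool.submit(client, c, servers, n_writes, payload_size)
            for c in range(n_clients)
        ]
        for f in futures:
            f.result()  # re-raise any exception from a client thread
    return servers
```

Calling `run(n_servers=2, n_clients=4)` and summing `len(s)` over the returned servers gives the total number of stored entries. The parameters `n_clients`, `n_servers`, and `payload_size` loosely mirror the client count, server count, and value size that the abstract names as factors in performance.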