On-the-Fly Data Race Detection for MPI RMA Programs with MUST
Simon Schwitanski, Joachim Jenke, Felix Tomski, C. Terboven, Matthias S. Müller
2022 IEEE/ACM Sixth International Workshop on Software Correctness for HPC Applications (Correctness), published 2022-11-01
DOI: 10.1109/Correctness56720.2022.00009
Citation count: 1
Abstract
MPI Remote Memory Access (RMA) provides a one-sided communication model for MPI applications. Ensuring consistency between RMA operations through synchronization calls is a key requirement when writing correct RMA codes. Wrong API usage may lead to concurrent modifications of the same memory location without proper synchronization, resulting in data races across processes. Due to their non-deterministic nature, such data races are hard to detect. This paper presents MUST-RMA, an on-the-fly data race detector for MPI RMA applications. MUST-RMA uses a race detection model based on happened-before and consistency analysis. It combines the MPI correctness tool MUST with the race detector ThreadSanitizer to detect races across processes in RMA applications. A classification quality study on MUST-RMA with different test cases shows a precision and recall of 0.95. An overhead study on a stencil and a matrix transpose kernel shows runtime slowdowns of 3x to 20x for up to 192 processes.
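To make the happened-before idea concrete, the following is a minimal sketch of a generic vector-clock race check. It is not MUST-RMA's implementation, and it leaves out the consistency analysis the abstract mentions. The `Access` record, the clock values, and the function names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Access:
    """Hypothetical record of one memory access (illustration only)."""
    rank: int        # process that issued the access
    address: int     # memory location that is touched
    is_write: bool   # True for a modification, False for a read
    clock: tuple     # vector clock of the issuing process at that time


def happened_before(a, b):
    """True if vector clock a is ordered strictly before vector clock b."""
    return all(x <= y for x, y in zip(a, b)) and a != b


def is_race(a, b):
    """Two accesses conflict if they touch the same location, at least one
    of them writes, and neither is ordered before the other."""
    if a.address != b.address or not (a.is_write or b.is_write):
        return False
    return (not happened_before(a.clock, b.clock)
            and not happened_before(b.clock, a.clock))


# Assumed scenario: two ranks write the same location with no
# synchronization between them, so their clocks are incomparable.
put0 = Access(rank=0, address=0x10, is_write=True, clock=(1, 0))
put1 = Access(rank=1, address=0x10, is_write=True, clock=(0, 1))

# Assumed scenario: rank 1 writes only after a synchronization that
# merged rank 0's clock into its own, so the writes are ordered.
put1_synced = Access(rank=1, address=0x10, is_write=True, clock=(1, 1))

print(is_race(put0, put1))         # True
print(is_race(put0, put1_synced))  # False
```

In this sketch a synchronization call is modeled only by its effect on the clocks: once rank 1's clock dominates rank 0's, the two writes are ordered and no race is reported.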