MC-CChecker: A Clock-Based Approach to Detect Memory Consistency Errors in MPI One-Sided Applications

Thanh-Dang Diep, K. Fürlinger, N. Thoai
Proceedings of the 25th European MPI Users' Group Meeting, published 2018-09-23
DOI: https://doi.org/10.1145/3236367.3236369
Citations: 7

Abstract

MPI one-sided communication decouples data movement from synchronization, which eliminates the overhead of unneeded synchronization and allows for greater concurrency. This is the great advantage of MPI one-sided communication, but it also poses enormous challenges for programmers in preserving the reliability of programs. Memory consistency errors are notorious for degrading the reliability as well as the performance of MPI one-sided applications, and even an MPI expert can easily make these mistakes. The lockopts bug, which occurred in an RMA test case that is part of the MPICH MPI implementation, is one example. Hence, detecting memory consistency errors is extremely challenging. MC-Checker is the most advanced debugger for addressing these errors effectively. MC-Checker detects memory consistency errors based on the happened-before relation. Exploiting the relation in full makes MC-Checker's DN-Analyzer difficult to scale, so MC-Checker ignores the transitive ordering of the happened-before relation to keep DN-Analyzer scalable. Consequently, MC-Checker carries a potential source of false positives. To overcome this issue, we present a novel clock-based approach called MC-CChecker, which aims to fully preserve the happened-before relation by making use of an encoded vector clock. MC-CChecker inherits the distinguishing features of MC-Checker by reusing ST-Analyzer and Profiler, and focuses mainly on optimizing DN-Analyzer. The experimental results show that MC-CChecker not only detects memory consistency errors as effectively as MC-Checker, but also completely eliminates the potential source of false positives that is a major limitation of MC-Checker, while keeping the execution time and memory overheads of DN-Analyzer acceptable. In particular, MC-CChecker's DN-Analyzer scales fairly well when processing the large number of trace files generated by running lockopts with up to 8192 processes.
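
To make the role of transitive ordering concrete, the following is a minimal sketch of a plain textbook vector clock. It is not the encoded vector clock used by MC-CChecker, whose encoding the abstract does not describe. All names (tick, merge, happened_before, Access, conflicts) are hypothetical, and the notion of a conflict used here (two unordered accesses to overlapping ranges, at least one of them a write) is a simplifying assumption made for illustration only.

```python
from dataclasses import dataclass
from typing import List

Clock = List[int]


def tick(clock: Clock, rank: int) -> Clock:
    """Return a copy of `clock` advanced by one local event on `rank`."""
    advanced = list(clock)
    advanced[rank] += 1
    return advanced


def merge(local: Clock, received: Clock, rank: int) -> Clock:
    """Join two clocks component-wise, then count the receive as a local event."""
    joined = [max(a, b) for a, b in zip(local, received)]
    return tick(joined, rank)


def happened_before(a: Clock, b: Clock) -> bool:
    """True if the event stamped `a` is ordered before the event stamped `b`."""
    return a != b and all(x <= y for x, y in zip(a, b))


def concurrent(a: Clock, b: Clock) -> bool:
    """True if neither event is ordered before the other."""
    return not happened_before(a, b) and not happened_before(b, a)


@dataclass
class Access:
    """Hypothetical record of one memory access: byte range [start, end)."""
    clock: Clock
    start: int
    end: int
    is_write: bool


def conflicts(a: Access, b: Access) -> bool:
    """Assumed conflict rule: overlapping ranges, one write, no ordering."""
    overlap = a.start < b.end and b.start < a.end
    return overlap and (a.is_write or b.is_write) and concurrent(a.clock, b.clock)


# Three processes. P0 and P2 never synchronize directly, only through P1.
zero = [0, 0, 0]
e0 = tick(zero, 0)            # event on P0
e_early = tick(zero, 2)       # event on P2 before any synchronization
e1 = merge(zero, e0, 1)       # P1 synchronizes with P0
e2 = merge(e_early, e1, 2)    # P2 synchronizes with P1

put_p0 = Access(e0, 0, 8, True)
get_p2_late = Access(e2, 4, 12, False)
put_p2_early = Access(e_early, 0, 8, True)

print(happened_before(e0, e2))            # ordered through P1 (transitive)
print(conflicts(put_p0, get_p2_late))     # ordered, so not reported
print(conflicts(put_p0, put_p2_early))    # unordered overlapping writes
```

In this sketch the ordering of e0 before e2 is visible from the clocks alone even though it exists only through P1, which is the kind of transitive ordering the abstract says must be preserved to avoid false positives.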