{"title":"Proceedings of the 25th European MPI Users' Group Meeting","authors":"","doi":"10.1145/3236367","DOIUrl":"https://doi.org/10.1145/3236367","url":null,"abstract":"","PeriodicalId":225539,"journal":{"name":"Proceedings of the 25th European MPI Users' Group Meeting","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131390870","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Transparent High-Speed Network Checkpoint/Restart in MPI
Julien Adam, Jean-Baptiste Besnard, A. Malony, S. Shende, Marc Pérache, Patrick Carribault, Julien Jaeger
DOI: https://doi.org/10.1145/3236367.3236383

Abstract: Fault tolerance has always been an important topic when it comes to running massively parallel programs at scale. Statistically, hardware and software failures are expected to occur more often on systems gathering millions of computing units. Moreover, the larger the job, the more computing hours are wasted by a crash. In this paper, we describe the work done in our MPI runtime to enable a transparent checkpointing mechanism. Unlike the MPI 4.0 User-Level Failure Mitigation (ULFM) interface, our work targets solely Checkpoint/Restart (C/R) and ignores wider features such as resiliency. We show how existing transparent checkpointing methods can be practically applied to MPI implementations given sufficient collaboration from the MPI runtime. Our C/R technique is then measured on MPI benchmarks such as IMB and LULESH over an InfiniBand high-speed network, demonstrating that the chosen approach is sufficiently general and that performance is mostly preserved. We argue that enabling fault tolerance without any modification to target MPI applications is possible, and show how it could be the first step toward more integrated resiliency combined with failure mitigation such as ULFM.
MPI Stages: Checkpointing MPI State for Bulk Synchronous Applications
Nawrin Sultana, A. Skjellum, I. Laguna, M. Farmer, K. Mohror, M. Emani
DOI: https://doi.org/10.1145/3236367.3236385

Abstract: When an MPI program experiences a failure, the most common recovery approach is to restart all processes from a previous checkpoint and to re-queue the entire job. A disadvantage of this method is that, although the failure occurred within the main application loop, live processes must start again from the beginning of the program, along with new replacement processes, which incurs unnecessary overhead for the live processes. To avoid such overheads and concomitant delays, we introduce the concept of "MPI Stages." MPI Stages saves internal MPI state in a separate checkpoint in conjunction with application state. Upon failure, both MPI and application state are recovered from their last synchronous checkpoints, and execution continues without restarting the overall MPI job. Live processes roll back only a few iterations within the main loop instead of rolling back to the beginning of the program, while a replacement for the failed process restarts and reintegrates, thereby achieving faster failure recovery. This approach integrates well with large-scale, bulk synchronous applications and checkpoint/restart. In this paper, we identify requirements for production MPI implementations to support state checkpointing with MPI Stages, which include capturing and managing internal MPI state and serializing and deserializing user handles to MPI objects. We evaluate our fault tolerance approach with a proof-of-concept prototype MPI implementation that includes MPI Stages. We demonstrate its functionality and performance using LULESH and microbenchmarks. Our results show that MPI Stages reduces the recovery time by 13× for LULESH in comparison to checkpoint/restart.
{"title":"Energy-efficient localised rollback via data flow analysis and frequency scaling","authors":"K. Dichev, K. Cameron, Dimitrios S. Nikolopoulos","doi":"10.1145/3236367.3236379","DOIUrl":"https://doi.org/10.1145/3236367.3236379","url":null,"abstract":"Exascale systems will suffer failures hourly. HPC programmers rely mostly on application-level checkpoint and a global rollback to recover. In recent years, techniques reducing the number of rolling back processes have been implemented via message logging. However, the log-based approaches have weaknesses, such as being dependent on complex modifications within an MPI implementation, and the fact that a full restart may be required in the general case. To address the limitations of all log-based mechanisms, we return to checkpoint-only mechanisms, but advocate data flow rollback (DFR), a fundamentally different approach relying on analysis of the data flow of iterative codes, and the well-known concept of data flow graphs. We demonstrate the benefits of DFR for an MPI stencil code by localising rollback, and then reduce energy consumption by 10-12% on idling nodes via frequency scaling. We also provide large-scale estimates for the energy savings of DFR compared to global rollback, which for stencil codes increase as n2 for a process count n.","PeriodicalId":225539,"journal":{"name":"Proceedings of the 25th European MPI Users' Group Meeting","volume":"54 52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121142024","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Full-Duplex Inter-Group All-to-All Broadcast Algorithms with Optimal Bandwidth
Qiao Kang, J. Träff, Reda Al-Bahrani, Ankit Agrawal, A. Choudhary, W. Liao
DOI: https://doi.org/10.1145/3236367.3236374

Abstract: MPI inter-group collective communication patterns can be viewed as bipartite graphs that divide processes into two disjoint groups, where messages are transferred between, but not within, the groups. Such communication patterns can serve as basic operations for scientific application workflows. In this paper, we present parallel algorithms for inter-group all-to-all broadcast (Allgather) communication with optimal bandwidth for any message size and process count under single-port communication constraints. We implement the algorithms using MPI point-to-point and intra-group collective communication functions and evaluate their performance on the Cori supercomputer at NERSC. Using message sizes ranging from 256 B to 64 MB, the experiments show a significant performance improvement achieved by our algorithm, which is up to 9.27 times faster than production MPI libraries that adopt the so-called root-gathering algorithm.
Enabling callback-driven runtime introspection via MPI_T
Marc-André Hermanns, N. Hjelm, Michael Knobloch, K. Mohror, M. Schulz
DOI: https://doi.org/10.1145/3236367.3236370

Abstract: Understanding the behavior of parallel applications that use the Message Passing Interface (MPI) is critical for optimizing communication performance. Performance tools for MPI currently rely on the PMPI profiling interface or the MPI Tools Information Interface, MPI_T, for portably collecting information for performance measurement and analysis. While tools using these interfaces have proven to be extremely valuable for performance tuning, these interfaces only provide synchronous information, i.e., when an MPI or an MPI_T function is called. There is currently no option for collecting information about asynchronous events from within the MPI library. In this work, we propose a callback-driven interface for event notification from MPI implementations. Our approach is integrated into the existing MPI_T interface and provides a portable API for tools to discover and register for events of interest. We demonstrate the functionality and usability of the interface with a prototype implementation in Open MPI, a small logging tool (MEL), and the measurement infrastructure Score-P.
{"title":"Using Node Information to Implement MPI Cartesian Topologies","authors":"W. Gropp","doi":"10.1145/3236367.3236377","DOIUrl":"https://doi.org/10.1145/3236367.3236377","url":null,"abstract":"The MPI API provides support for Cartesian process topologies, including the option to reorder the processes to achieve better communication performance. But MPI implementations rarely provide anything useful for the reorder option, typically ignoring it. One argument made is that modern interconnects are fast enough that applications are less sensitive to the exact layout of processes onto the system. However, intranode communication performance is much greater than internode communication performance. In this paper, we show a simple approach that takes into account only information about which MPI processes are on the same node to provide a fast and effective implementation of the MPI Cartesian topology. While not optimal, this approach provides a significant improvement over all tested MPI implementations and provides an implementation that may be used as the default in any MPI implementation of MPI_Cart_create.","PeriodicalId":225539,"journal":{"name":"Proceedings of the 25th European MPI Users' Group Meeting","volume":"75 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116932645","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MC-CChecker: A Clock-Based Approach to Detect Memory Consistency Errors in MPI One-Sided Applications","authors":"Thanh-Dang Diep, K. Fürlinger, N. Thoai","doi":"10.1145/3236367.3236369","DOIUrl":"https://doi.org/10.1145/3236367.3236369","url":null,"abstract":"MPI one-sided communication decouples data movement from synchronization, which eliminates overhead from unneeded synchronization and allows for greater concurrency. On the one hand this fact is the great advantage of MPI one-sided communication, but on the other, it poses enormous challenges for programmers in preserving the reliability of programs. Memory consistency errors are notorious for degrading reliability as well as performance of MPI one-sided applications. Even an MPI expert can easily make these mistakes. The lockopts bug occurred in an RMA test case that is part of MPICH MPI implementation is an example for this situation. Hence, detecting memory consistency errors is extremely challenging. MC-Checker is the most cutting-edge debugger to address these errors effectively. MC-Checker tackles the memory consistency errors based on the happened-before relation. Taking full advantage of the relation makes DN-Analyzer of MC-Checker difficult to scale well. For that reason, MC-Checker does ignore the transitive ordering of the happened-before relation to retain scalability of DN-Analyzer. Consequently, MC-Checker is highly able to impose a potential source of false positives. In order to overcome this issue, we present a novel clock-based approach called MC-CChecker with the aim of fully preserving the happened-before relation by making use of an encoded vector clock. MC-CChecker inherits distinguishing features from MC-Checker by reusing ST-Analyzer and Profiler while focusing mainly on the optimization of DN-Analyzer. The experimental findings prove that MC-CChecker not only effectively detects memory consistency errors as MC-Checker did, but also completely eliminates the potential source of false positives which is a major limitation of MC-Checker while still retaining acceptable overheads of execution time and memory usage for DN-Analyzer. Especially, DN-Analyzer of MC-CChecker is fairly scalable when processing a large amount of trace files generated from running the lockopts up to 8192 processes.","PeriodicalId":225539,"journal":{"name":"Proceedings of the 25th European MPI Users' Group Meeting","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130128802","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MPI+OpenMP Tasking Scalability for the Simulation of the Human Brain: Human Brain Project
Pedro Valero-Lara, R. Sirvent, Antonio J. Peña, X. Martorell, Jesús Labarta
DOI: https://doi.org/10.1145/3236367.3236373

Abstract: The simulation of the behavior of the human brain is one of the most ambitious challenges today, with no shortage of important applications. Many initiatives in the USA, Europe, and Japan attempt to achieve this challenging target. In this work we focus on the most important European initiative, the Human Brain Project, and on one of its tools, Arbor. This tool simulates the spikes triggered in a neuronal network by computing the voltage along the neurons' morphology, making it one of the most precise simulators available today. In the present work, we evaluate the use of MPI+OpenMP tasking on top of the Arbor simulator. We present the main characteristics of the Arbor tool and show how they can be efficiently managed using MPI+OpenMP tasking. We prove that this approach achieves good scaling even when computing a relatively low workload (number of neurons) per node, using up to 32 nodes. Our goal is not only a highly scalable MPI-based implementation, but also a tool with a high degree of abstraction that, through MPI+OpenMP tasking, does not sacrifice control or performance.
{"title":"Multi-Threading and Lock-Free MPI RMA Based Graph Processing on KNL and POWER Architectures","authors":"Mingzhe Li, Xiaoyi Lu, H. Subramoni, D. Panda","doi":"10.1145/3236367.3236371","DOIUrl":"https://doi.org/10.1145/3236367.3236371","url":null,"abstract":"Intel Knights Landing (KNL) and IBM POWER architectures are becoming widely deployed on modern supercomputing systems due to its powerful components. MPI Remote Memory Access (RMA) model that provides one-sided communication semantics has been seen as an attractive approach for developing High-Performance Data Analytics (HPDA) applications such as graph processing with irregular communication characteristics. To take advantage of a large number of hardware threads offered by KNL and POWER, HPDA applications and MPI RMA runtime need to be re-designed to get optimal performance. In this paper, we propose multi-threading and lock-free designs in the MPI runtime as well as Graph500 application on KNL and POWER architectures. At the micro-bench level, our proposed runtime-level designs are able to reduce the latency of uni-directional MPI_Put and MPI_Get by up to 3X compared to IntelMPI and Spectrum MPI. At the application level, with 1,024 processes on 32 KNL nodes, our proposed design could outperform IntelMPI library by 32%. With 512 processes on eight POWER nodes, our proposed design could outperform Spectrum MPI library by 19%. To the best of our knowledge, this is the first paper to design and evaluate MPI RMA-based graph processing applications on KNL and POWER architectures.","PeriodicalId":225539,"journal":{"name":"Proceedings of the 25th European MPI Users' Group Meeting","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114339644","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}