Efficient monitoring to detect wireless channel failures for MPI programs
E. Macías, Álvaro Suárez Sarmiento, V. Sunderam
12th Euromicro Conference on Parallel, Distributed and Network-Based Processing, 2004. Proceedings. Published 2004-03-08.
DOI: 10.1109/EMPDP.2004.1271469 (https://doi.org/10.1109/EMPDP.2004.1271469)
Citation count: 5
Abstract
In the last few years the use of wireless technology has grown rapidly, and as a result powerful portable computers with wireless cards have become viable nodes for parallel and distributed computing. In this scenario, frequent failures of the wireless channel must be expected. In MPI programs, such wireless network behavior shows up as communication failure. Although the MPI standard does not handle failures, some projects address this issue. To the best of our knowledge, no previous work presents a practical solution for fault handling in MPI programs that run in wireless environments. We present an application-level mechanism that, combined with wireless network monitoring software, detects these failures and warns MPI applications so that they can take appropriate action.
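The abstract gives no implementation detail, so the following is only a minimal sketch of the general idea of a monitor that detects a channel failure and warns the application. It is not the authors' mechanism. The names `ChannelMonitor`, `probe`, `on_failure`, `max_misses`, and `run_demo` are all hypothetical, and the link probe is simulated by a list of booleans rather than by real wireless monitoring software or MPI calls.

```python
from typing import Callable, List, Tuple


class ChannelMonitor:
    """Hypothetical application-level monitor (illustration only).

    `probe` stands in for the wireless monitoring software: it returns
    True while the link is usable. `on_failure` is the warning delivered
    to the application once `max_misses` consecutive probes have failed.
    """

    def __init__(
        self,
        probe: Callable[[], bool],
        on_failure: Callable[[], None],
        max_misses: int = 3,
    ) -> None:
        self._probe = probe
        self._on_failure = on_failure
        self._max_misses = max_misses
        self._misses = 0
        self.failed = False

    def poll(self) -> bool:
        """Run one probe; return True while the channel is considered failed."""
        if self._probe():
            # Link is back (or still up): clear the failure state.
            self._misses = 0
            self.failed = False
        else:
            self._misses += 1
            if self._misses >= self._max_misses and not self.failed:
                # Warn the application once per outage, not on every probe.
                self.failed = True
                self._on_failure()
        return self.failed


def run_demo(link_states: List[bool]) -> Tuple[List[bool], List[str]]:
    """Feed a simulated sequence of link states through the monitor."""
    states = iter(link_states)
    warnings: List[str] = []
    monitor = ChannelMonitor(
        probe=lambda: next(states),
        on_failure=lambda: warnings.append("channel down"),
        max_misses=2,
    )
    history = [monitor.poll() for _ in link_states]
    return history, warnings


if __name__ == "__main__":
    print(run_demo([True, False, False, False, True]))
```

In a real program the application would call `poll()` between communication phases, or run it from a helper thread, and the warning callback would decide what "appropriate action" means for that application, for example deferring sends or excluding the unreachable node. Requiring several consecutive misses is an assumed design choice here, meant to avoid reacting to a single transient drop.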