An Improved Straggler Identification Scheme for Data-Intensive Computing on Cloud Platforms
Authors: Wei Dai, Ibrahim Adel Ibrahim, M. Bassiouni
Venue: 2017 IEEE 4th International Conference on Cyber Security and Cloud Computing (CSCloud)
Publication date: 2017-06-01
DOI: 10.1109/CSCloud.2017.64 (https://doi.org/10.1109/CSCloud.2017.64)
Citation count: 4
Abstract
One of the challenges in data-intensive computing is the straggler problem, which can significantly increase job completion time. Various proactive and reactive straggler mitigation techniques have been developed to address it. Straggler identification is a crucial part of these techniques: job completion time improves meaningfully only when stragglers are detected both correctly and early enough. Although the classical standard deviation method is a widely adopted straggler identification scheme, it is not an ideal solution because of certain inherent limitations. In this paper, we present Tukey's method, another statistical method for outlier detection, which is better suited to identifying stragglers for two reasons. First, it is robust to extreme observations from stragglers. Second, it can identify stragglers and, more importantly, start speculative execution earlier than the standard deviation method. Our extensive simulation results confirm that Tukey's method can substantially outperform the standard deviation method.
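The robustness claim can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not give thresholds, so the cutoffs here are assumptions (mean plus 2 standard deviations for the standard deviation method, and the conventional upper fence Q3 + 1.5 × IQR for Tukey's method), and the function names and task durations are made up for illustration.

```python
import statistics


def stddev_stragglers(durations, k=2.0):
    """Flag durations above mean + k * standard deviation.

    k=2.0 is an assumed cutoff; the abstract does not state one.
    """
    mean = statistics.fmean(durations)
    sd = statistics.pstdev(durations)  # population standard deviation
    threshold = mean + k * sd
    return [d for d in durations if d > threshold]


def tukey_stragglers(durations, k=1.5):
    """Flag durations above Tukey's upper fence, Q3 + k * IQR.

    k=1.5 is the conventional fence multiplier, assumed here.
    """
    q1, _median, q3 = statistics.quantiles(durations, n=4, method="inclusive")
    threshold = q3 + k * (q3 - q1)
    return [d for d in durations if d > threshold]


# Hypothetical task durations in seconds: ten normal tasks,
# one moderate straggler (30) and one extreme straggler (200).
durations = [10, 11, 9, 10, 12, 11, 10, 9, 11, 10, 30, 200]

print("standard deviation method:", stddev_stragglers(durations))
print("Tukey's method:", tukey_stragglers(durations))
```

On this sample the extreme value of 200 inflates both the mean and the standard deviation, so the standard deviation rule flags only 200 and misses the 30-second task. The quartiles barely move, so the Tukey fence flags both 30 and 200. The sketch covers only the robustness point; it does not model when speculative execution would start.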