Exclusive squashing for thread-level speculation
Álvaro García-Yágüez, D. Ferraris, Arturo González-Escribano
IEEE International Symposium on High-Performance Parallel Distributed Computing, published 2011-06-08
DOI: 10.1145/1996130.1996172 (https://doi.org/10.1145/1996130.1996172)
Citations: 3
Abstract
Speculative parallelization is a runtime technique that optimistically executes sequential code in parallel, checking that no dependence violations appear. In this paper, we address the problem of minimizing the number of threads that should be restarted when a data dependence violation is found. We present a new mechanism that keeps track of inter-thread dependencies in order to selectively stop and restart offending threads, and all threads that have consumed data from them. Results show a reduction of 38.5% to 81.8% in the number of restarted threads for real application loops and up to a 10% speedup, depending on the amount of local computation.
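The selective restart described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `consumers` map, the function names, and the six-thread example are all assumptions made for illustration. The idea shown is that, when a violation is detected, only the offending thread and the threads that transitively consumed its data are restarted. A conservative policy that restarts the offender and every later thread is included for comparison, and is likewise an assumption about what the baseline looks like.

```python
from collections import deque


def threads_to_squash(offender, consumers):
    """Return the offender plus every thread that transitively consumed its data.

    `consumers` maps a thread id to the set of thread ids that read values
    produced by it (a hypothetical bookkeeping structure for this sketch).
    """
    squashed = {offender}
    pending = deque([offender])
    while pending:
        producer = pending.popleft()
        for consumer in consumers.get(producer, ()):
            if consumer not in squashed:
                squashed.add(consumer)
                pending.append(consumer)
    return squashed


def squash_all_successors(offender, active_threads):
    """Assumed baseline: restart the offender and every later thread."""
    return {t for t in active_threads if t >= offender}


# Hypothetical window of six speculative threads, numbered in sequential order.
active = [1, 2, 3, 4, 5, 6]
# Thread 4 read a value produced by thread 2; thread 6 read one from thread 4.
consumers = {2: {4}, 4: {6}}

selective = threads_to_squash(2, consumers)
baseline = squash_all_successors(2, active)
```

In this made-up window, a violation in thread 2 restarts threads 2, 4, and 6 under the selective policy, whereas the assumed baseline restarts threads 2 through 6. Threads 3 and 5 never consumed data from thread 2, so their work is kept.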