DOACROSS Parallelization Based on Component Annotation and Loop-Carried Probability
Luis Mattos, D. C. S. Lucas, Juan Salamanca, J. P. L. Carvalho, M. Pereira, G. Araújo
2018 30th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), published 2018-09-01
DOI: 10.1109/CAHPC.2018.8645904 (https://doi.org/10.1109/CAHPC.2018.8645904)
Citations: 2
Abstract
Although modern compilers implement many loop parallelization techniques, their application is typically restricted to loops that have no loop-carried dependences (DOALL) or that contain well-known structured dependence patterns (e.g., reductions). These restrictions preclude the parallelization of many computationally intensive DOACROSS loops. In such loops, either the compiler finds at least one loop-carried dependence or it cannot prove, at compile time, that the loop is free of such dependences, even though they might never show up at runtime. In either case, most compilers end up not parallelizing DOACROSS loops. This paper makes three contributions to address this problem. First, it integrates three algorithms (TLS, DOAX, and BDX) into a simple OpenMP clause that enables the programmer to select the best algorithm for a given loop. Second, it proposes an annotation approach to separate the sequential components of a loop, thus exposing the other components to parallelization. Finally, it shows that loop-carried probability is an effective metric for deciding when to use TLS or a non-speculative technique (e.g., DOAX or BDX) to parallelize DOACROSS loops. Experimental results reveal that, for certain loops, slow-downs can be transformed into 2× speed-ups by quickly selecting the appropriate algorithm.
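To make the loop-carried probability metric concrete, the following is a minimal sketch and not the paper's implementation. It assumes that loop-carried probability can be estimated as the fraction of iterations that read a location written by an earlier iteration, using a recorded trace of per-iteration read and write sets. The function names, the trace format, the direction of the decision rule, and the 0.2 threshold are all hypothetical and chosen only for illustration.

```python
from typing import List, Set, Tuple

# One iteration's memory accesses: (locations read, locations written).
Access = Tuple[Set[int], Set[int]]


def loop_carried_probability(trace: List[Access]) -> float:
    """Estimate the fraction of iterations that read a location written
    by an earlier iteration (a loop-carried read-after-write dependence).
    This definition is an assumption made for illustration."""
    if not trace:
        return 0.0
    written: Set[int] = set()  # locations written by earlier iterations
    dependent = 0
    for reads, writes in trace:
        if reads & written:    # this iteration depends on an earlier one
            dependent += 1
        written |= writes      # record writes only after the check
    return dependent / len(trace)


def choose_strategy(probability: float, threshold: float = 0.2) -> str:
    """Pick speculation when dependences are rare, otherwise a
    non-speculative scheme. The threshold is a hypothetical value."""
    if probability < threshold:
        return "speculative (TLS)"
    return "non-speculative (DOAX or BDX)"


def trace_indirect_update(idx: List[int]) -> List[Access]:
    """Trace for a loop body of the form a[idx[i]] = a[idx[i]] + 1,
    which reads and writes the single location idx[i]."""
    return [({k}, {k}) for k in idx]


if __name__ == "__main__":
    sparse = trace_indirect_update([0, 1, 2, 3, 4, 5, 6, 7, 8, 0])
    dense = trace_indirect_update([0, 0, 0, 0])
    for name, trace in (("sparse", sparse), ("dense", dense)):
        p = loop_carried_probability(trace)
        print(name, p, choose_strategy(p))
```

In this sketch, an index array with a single repeated entry yields a low estimate and selects speculation, while a loop that updates the same location on every iteration yields a high estimate and selects a non-speculative scheme. A real system would obtain the estimate from profiling or compiler analysis rather than from a hand-built trace.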