Dynamic SIMD re-convergence with paired-path comparison
Authors: Yun-Chi Huang, Kuan-Chieh Hsu, Wan-shan Hsieh, Chen-Chieh Wang, Chia-Han Lu, C. Chen
Published in: 2016 IEEE International Symposium on Circuits and Systems (ISCAS), pp. 233-236
Publication date: 2016-05-22
DOI: 10.1109/ISCAS.2016.7527213 (https://doi.org/10.1109/ISCAS.2016.7527213)
Citations: 1
Abstract
SIMD divergence is one of the critical factors that reduce hardware utilization in contemporary GPGPUs (general-purpose graphics processing units). Both the re-convergence scheme and control-flow detection must be designed carefully. For the emerging HSA (Heterogeneous System Architecture) platform, we develop an effective dynamic stack-based re-convergence scheme that can be implemented without re-convergence instructions inserted by the finalizer. The stack tracks only the minimal necessary information about the taken and not-taken paths, so no additional end-of-branch instructions are required under our design. With the proposed scheme, a divergent warp dynamically re-converges at opportunistic re-convergence points. The activity factor improves by 13.36% on average as a result of opportunistic early re-convergence in unstructured control flow. Our design eases the development of a finalizer, which no longer needs to reason about the re-convergence point after a branch divergence, especially for unstructured control flow.
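The abstract does not give the stack layout or the comparison logic, so the following Python sketch is only a toy software model of the general idea, not the authors' hardware design. It assumes that each stack entry holds the taken and not-taken paths of one divergent branch, that the path with the lower program counter is issued first, and that the two paths merge as soon as their program counters are equal. The names `Path`, `simulate`, `takes_branch`, and the tuple-based instruction encoding are all hypothetical, and the model assumes forward branches only.

```python
from dataclasses import dataclass

@dataclass(eq=False)
class Path:
    pc: int    # next instruction index for this path
    mask: int  # bit i set => lane i is active on this path

def simulate(program, num_lanes, takes_branch):
    """Toy model (an assumption, not the paper's design).

    program: list of ("op",), ("br", target), ("jmp", target), ("exit",)
    takes_branch(pc, lane) -> bool decides each lane's branch outcome.
    Returns (issued instructions, active lane-slots, merge PCs).
    """
    full = (1 << num_lanes) - 1
    # Each stack entry holds one path or a taken/not-taken pair.
    stack = [[Path(0, full)]]
    issued = active = 0
    merges = []
    while True:
        entry = stack[-1]
        # Paired-path comparison: equal PCs mean the pair can re-converge,
        # with no explicit re-convergence instruction in the program.
        if len(entry) == 2 and entry[0].pc == entry[1].pc:
            merges.append(entry[0].pc)
            entry[:] = [Path(entry[0].pc, entry[0].mask | entry[1].mask)]
        # A non-bottom entry with fewer than two paths is finished:
        # hand any surviving path back to the parent entry.
        if len(entry) < 2 and len(stack) > 1:
            stack.pop()
            stack[-1].extend(entry)
            continue
        if not entry:
            break  # bottom entry is empty: every lane has exited
        path = min(entry, key=lambda p: p.pc)  # assumed min-PC policy
        op = program[path.pc]
        issued += 1
        active += bin(path.mask).count("1")
        if op[0] == "exit":
            entry.remove(path)
        elif op[0] == "jmp":
            path.pc = op[1]
        elif op[0] == "br":
            taken = sum(1 << lane for lane in range(num_lanes)
                        if (path.mask >> lane) & 1 and takes_branch(path.pc, lane))
            not_taken = path.mask & ~taken
            if taken and not_taken:
                # Divergence: replace the path with a taken/not-taken pair.
                entry.remove(path)
                stack.append([Path(op[1], taken), Path(path.pc + 1, not_taken)])
            else:
                path.pc = op[1] if taken else path.pc + 1
        else:
            path.pc += 1
    return issued, active, merges

# Hypothetical if/else diamond: lanes 0 and 1 take the branch at pc 0.
DIAMOND = [("br", 3), ("op",), ("jmp", 4), ("op",), ("op",), ("exit",)]
issued, active, merges = simulate(DIAMOND, 4, lambda pc, lane: lane < 2)
print(issued, active, merges)   # 6 18 [4]
print(active / (issued * 4))    # activity factor of this toy run: 0.75
```

In this run the two paths merge at instruction 4 purely because their program counters match. The sketch compares only the two paths of the top stack entry, so a nested path that runs past a waiting sibling in a parent entry would merge later or not at all; how the actual design handles that case is not stated in the abstract.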