Migration and Tuning of Software Prefetching for Sunway Multi-Core Processor
Xiuwu Gao, Jun Jiang, Liangming Huang, Hongmei Wei
2022 7th International Conference on Intelligent Computing and Signal Processing (ICSP), published 2022-04-15
DOI: https://doi.org/10.1109/ICSP54964.2022.9778713
Abstract
Data prefetching is a widely used technique for alleviating the "memory wall" problem: data that is likely to be accessed in the near future is fetched before it is needed. Data prefetching is generally classified into hardware prefetching and software prefetching. Compared to hardware prefetching, software prefetching is more flexible and typically achieves higher prefetch accuracy. Currently, the Sunway multi-core processor supports only hardware prefetching. To study how software prefetching performs on the Sunway processor, in this paper we first migrate the software data prefetching in the GCC compiler to the Sunway processor. We then tune the loop-level prefetching cost model according to the Sunway processor’s hardware features. Finally, we conduct experiments to evaluate the effectiveness of the tuned software prefetching. Results show that, compared to a baseline with no prefetching applied, software prefetching delivers an average speedup of 1.08x (up to 2.46x) for the SPECint 2006 benchmark suite and 1.16x (up to 1.88x) for the SPECfp 2006 benchmark suite. Moreover, software prefetching outperforms hardware prefetching on both benchmark suites. This demonstrates the efficacy of software data prefetching.
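
As an illustration of what software prefetching looks like at the source level, the following minimal C sketch issues an explicit prefetch hint a fixed number of elements ahead of the current loop iteration, using GCC's `__builtin_prefetch` builtin. It is not the paper's implementation: GCC's loop-level prefetching pass (enabled with `-fprefetch-loop-arrays`) inserts comparable hints automatically, guided by its cost model. The function names and the `PREFETCH_DISTANCE` value are hypothetical and chosen only for illustration; a real distance would depend on the target's memory latency and the cost of one loop iteration.

```c
#include <stddef.h>

/* Hypothetical prefetch distance, in array elements (an assumption for this
   sketch, not a value taken from the paper). */
#define PREFETCH_DISTANCE 16

/* Sum an array while hinting that a[i + PREFETCH_DISTANCE] will be read soon. */
long sum_with_prefetch(const long *a, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        /* Only form addresses that lie inside the array. */
        if (i + PREFETCH_DISTANCE < n) {
            /* Second argument 0: prefetch for a read.
               Third argument 3: high temporal locality. */
            __builtin_prefetch(&a[i + PREFETCH_DISTANCE], 0, 3);
        }
        sum += a[i];
    }
    return sum;
}

/* Reference version without prefetch hints, for comparing results. */
long sum_plain(const long *a, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += a[i];
    }
    return sum;
}
```

Both functions return the same sum; the prefetch is only a hint and does not change program semantics. Whether it helps depends on the access pattern, the prefetch distance, and how well the hardware prefetcher already covers the loop.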