比较并结合了读缺失聚类和软件预取

Vijay S. Pai, S. Adve
{"title":"比较并结合了读缺失聚类和软件预取","authors":"Vijay S. Pai, S. Adve","doi":"10.1109/PACT.2001.953310","DOIUrl":null,"url":null,"abstract":"A recent latency tolerance technique, read-miss clustering, restructures code to send demand-miss references in parallel to the underlying memory system. An alternative, widely-used latency tolerance technique is software prefetching, which initiates data fetches ahead of expected demand-miss references by a certain distance. Since both techniques seem to target the same types of latencies and use the same system resources, it is unclear which technique is superior or if both can be combined. This paper shows that these two techniques are actually mutually beneficial, each helping to overcome limitations of the other: We perform our study for uniprocessor and multiprocessor configurations, in simulation and on a real machine (the Convex Exemplar). Compared to prefetching alone (the state-of-the-art implemented in systems today), the combination of the two techniques reduces the execution time by an average of 21% across all cases studied in simulation, and by an average of 16% for 5 out of 10 cases on the Exemplar. The combination sees execution time reductions relative to clustering alone averaging 15% for 6 out of 11 cases in simulation and 20% for 6 out of 10 cases on the Exemplar.","PeriodicalId":276650,"journal":{"name":"Proceedings 2001 International Conference on Parallel Architectures and Compilation Techniques","volume":"27 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2001-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"11","resultStr":"{\"title\":\"Comparing and combining read miss clustering and software prefetching\",\"authors\":\"Vijay S. Pai, S. Adve\",\"doi\":\"10.1109/PACT.2001.953310\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"A recent latency tolerance technique, read-miss clustering, restructures code to send demand-miss references in parallel to the underlying memory system. An alternative, widely-used latency tolerance technique is software prefetching, which initiates data fetches ahead of expected demand-miss references by a certain distance. Since both techniques seem to target the same types of latencies and use the same system resources, it is unclear which technique is superior or if both can be combined. This paper shows that these two techniques are actually mutually beneficial, each helping to overcome limitations of the other: We perform our study for uniprocessor and multiprocessor configurations, in simulation and on a real machine (the Convex Exemplar). Compared to prefetching alone (the state-of-the-art implemented in systems today), the combination of the two techniques reduces the execution time by an average of 21% across all cases studied in simulation, and by an average of 16% for 5 out of 10 cases on the Exemplar. The combination sees execution time reductions relative to clustering alone averaging 15% for 6 out of 11 cases in simulation and 20% for 6 out of 10 cases on the Exemplar.\",\"PeriodicalId\":276650,\"journal\":{\"name\":\"Proceedings 2001 International Conference on Parallel Architectures and Compilation Techniques\",\"volume\":\"27 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2001-09-08\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"11\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings 2001 International Conference on Parallel Architectures and Compilation Techniques\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/PACT.2001.953310\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings 2001 International Conference on Parallel Architectures and Compilation Techniques","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/PACT.2001.953310","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 11

摘要

最近的一种延迟容忍技术,读缺失集群,重构代码并发发送请求缺失引用到底层内存系统。另一种广泛使用的延迟容忍技术是软件预取,它比预期的需求缺失引用提前一定距离启动数据提取。由于这两种技术似乎针对相同类型的延迟并使用相同的系统资源,因此尚不清楚哪种技术更好,或者两者是否可以结合使用。本文表明,这两种技术实际上是相互有益的,每一种技术都有助于克服另一种技术的局限性:我们在模拟和真实机器(凸范例)上对单处理器和多处理器配置进行了研究。与单独预取(目前系统中实现的最先进技术)相比,这两种技术的结合在模拟中研究的所有案例中平均减少了21%的执行时间,在Exemplar上10个案例中有5个案例平均减少了16%的执行时间。在模拟的11个案例中,有6个案例的执行时间比单独的集群平均减少了15%,在Exemplar的10个案例中,有6个案例的执行时间减少了20%。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Comparing and combining read miss clustering and software prefetching
A recent latency tolerance technique, read-miss clustering, restructures code to send demand-miss references in parallel to the underlying memory system. An alternative, widely-used latency tolerance technique is software prefetching, which initiates data fetches ahead of expected demand-miss references by a certain distance. Since both techniques seem to target the same types of latencies and use the same system resources, it is unclear which technique is superior or if both can be combined. This paper shows that these two techniques are actually mutually beneficial, each helping to overcome limitations of the other: We perform our study for uniprocessor and multiprocessor configurations, in simulation and on a real machine (the Convex Exemplar). Compared to prefetching alone (the state-of-the-art implemented in systems today), the combination of the two techniques reduces the execution time by an average of 21% across all cases studied in simulation, and by an average of 16% for 5 out of 10 cases on the Exemplar. The combination sees execution time reductions relative to clustering alone averaging 15% for 6 out of 11 cases in simulation and 20% for 6 out of 10 cases on the Exemplar.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信