Competitive cost-effective memory access predictor through short-term online SVM and dynamic vocabularies

Impact Factor 6.2 · CAS Zone 2 (Computer Science) · JCR Q1, Computer Science, Theory & Methods
Pablo Sanchez-Cuevas, Fernando Diaz-del-Rio, Daniel Casanueva-Morato, Antonio Rios-Navarro
{"title":"Competitive cost-effective memory access predictor through short-term online SVM and dynamic vocabularies","authors":"Pablo Sanchez-Cuevas ,&nbsp;Fernando Diaz-del-Rio ,&nbsp;Daniel Casanueva-Morato ,&nbsp;Antonio Rios-Navarro","doi":"10.1016/j.future.2024.107592","DOIUrl":null,"url":null,"abstract":"<div><div>In recent years, there has been a significant increase in the processing of massive amounts of data, driven by the growing demands of mobile systems, parallel and distributed architectures, and real-time systems. This applies to various types of platforms, both specific and general-purpose. Despite numerous advancements in Computer Systems, a critical challenge remains: the efficiency and speed of memory access. This bottleneck is being addressed through cache prefetching, that is, by predicting the next memory address to be accessed and then by having always prefetched in the cache system those data to be used shortly by the processor. This paper explores established intelligent techniques for address prediction, examining their limitations and analyzing the memory access patterns of popular software applications. Building on the successes of previous intelligent predictors based on Machine and Deep Learning models, we introduce a new predictor, SVM4AP (Support Vector Machine For Address Prediction), designed to overcome the identified drawbacks of its predecessors. The architecture of SVM4AP improves the trade-off between performance and cost, compared to those previous proposals in the literature, achieving high accuracy through short-term learning. Comparisons are made with two prominent predictors from the literature: the classical DFCM (Differential Finite Context Method) and the contemporary Deep Learning-based DCLSTM (Doubly Compressed Long-Short Term Memory). The results demonstrate that SVM4AP achieves superior cost-effectiveness across various configurations. Simulations reveal that SVM4AP configurations dominate both DFCM and DCLSTM counterparts, forming the majority of the first Paretto front. Particularly noteworthy is the significant advantage of our proposal for small-size predictors. Furthermore, we release an open-source tool enabling the scientific community to reproduce the results presented in this paper using a set of benchmark traces.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"164 ","pages":"Article 107592"},"PeriodicalIF":6.2000,"publicationDate":"2024-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Future Generation Computer Systems-The International Journal of Escience","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0167739X24005569","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, THEORY & METHODS","Score":null,"Total":0}
Citations: 0

Abstract

In recent years, there has been a significant increase in the processing of massive amounts of data, driven by the growing demands of mobile systems, parallel and distributed architectures, and real-time systems. This applies to various types of platforms, both specific and general-purpose. Despite numerous advancements in Computer Systems, a critical challenge remains: the efficiency and speed of memory access. This bottleneck is being addressed through cache prefetching, that is, by predicting the next memory address to be accessed so that the data the processor will need shortly are already present in the cache. This paper explores established intelligent techniques for address prediction, examining their limitations and analyzing the memory access patterns of popular software applications. Building on the successes of previous intelligent predictors based on Machine and Deep Learning models, we introduce a new predictor, SVM4AP (Support Vector Machine For Address Prediction), designed to overcome the identified drawbacks of its predecessors. The architecture of SVM4AP improves the trade-off between performance and cost compared to previous proposals in the literature, achieving high accuracy through short-term learning. Comparisons are made with two prominent predictors from the literature: the classical DFCM (Differential Finite Context Method) and the contemporary Deep Learning-based DCLSTM (Doubly Compressed Long Short-Term Memory). The results demonstrate that SVM4AP achieves superior cost-effectiveness across various configurations. Simulations reveal that SVM4AP configurations dominate both their DFCM and DCLSTM counterparts, forming the majority of the first Pareto front. Particularly noteworthy is the significant advantage of our proposal for small-size predictors. Furthermore, we release an open-source tool enabling the scientific community to reproduce the results presented in this paper using a set of benchmark traces.
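
The abstract gives no implementation details, but the core mechanism it describes (an online SVM trained over a short recent history, classifying the next address delta against a dynamically grown vocabulary of observed deltas) can be sketched roughly as follows. This is a minimal illustrative sketch under stated assumptions, not the paper's SVM4AP design: the history length, the vocabulary cap, the feature encoding, and the use of scikit-learn's SGDClassifier with hinge loss as a stand-in for an online linear SVM are all assumptions.

```python
# Minimal sketch, NOT the authors' SVM4AP implementation: an online ("short-term")
# linear SVM that predicts the next address delta, with class labels drawn from a
# dynamically grown vocabulary of observed deltas. All names and parameters are
# illustrative assumptions.
from collections import deque

import numpy as np
from sklearn.linear_model import SGDClassifier

MAX_CLASSES = 256  # cap on the delta vocabulary (illustrative, not from the paper)


class OnlineDeltaPredictor:
    def __init__(self, history_len=4):
        self.history = deque(maxlen=history_len)  # recent address deltas (features)
        self.vocab = {}                           # dynamic vocabulary: delta -> class id
        self.inv_vocab = {}                       # class id -> delta
        self.model = SGDClassifier(loss="hinge")  # hinge loss ~ linear SVM, updated online
        self.fitted = False
        self.last_addr = None

    def _class_of(self, delta):
        # Grow the vocabulary on the fly; once the cap is reached, unseen deltas map to class 0.
        if delta not in self.vocab and len(self.vocab) < MAX_CLASSES:
            cid = len(self.vocab)
            self.vocab[delta] = cid
            self.inv_vocab[cid] = delta
        return self.vocab.get(delta, 0)

    def observe(self, addr):
        """Feed one memory access; return a predicted next address, or None."""
        if self.last_addr is not None:
            delta = addr - self.last_addr
            cid = self._class_of(delta)
            if len(self.history) == self.history.maxlen:
                x = np.array([list(self.history)], dtype=float)
                # Short-term online learning: one incremental update per access.
                if not self.fitted:
                    self.model.partial_fit(x, [cid], classes=np.arange(MAX_CLASSES))
                    self.fitted = True
                else:
                    self.model.partial_fit(x, [cid])
            self.history.append(delta)
        self.last_addr = addr

        if self.fitted and len(self.history) == self.history.maxlen:
            x = np.array([list(self.history)], dtype=float)
            next_cid = int(self.model.predict(x)[0])
            if next_cid in self.inv_vocab:
                return addr + self.inv_vocab[next_cid]  # candidate address to prefetch
        return None


# Toy usage: after a few accesses of a fixed-stride pattern, the predictor
# starts returning a next-address guess that a prefetcher could act on.
pred = OnlineDeltaPredictor()
for a in range(0x1000, 0x1100, 8):
    guess = pred.observe(a)
```

A hardware predictor of this kind would be tightly constrained by storage and logic cost, which is precisely the performance-versus-cost trade-off the paper evaluates against DFCM and DCLSTM.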
Source journal metrics:
CiteScore: 19.90
Self-citation rate: 2.70%
Articles published: 376
Average review time: 10.6 months
Journal description: Computing infrastructures and systems are constantly evolving, resulting in increasingly complex and collaborative scientific applications. To cope with these advancements, there is a growing need for collaborative tools that can effectively map, control, and execute these applications. Furthermore, with the explosion of Big Data, there is a requirement for innovative methods and infrastructures to collect, analyze, and derive meaningful insights from the vast amount of data generated. This necessitates the integration of computational and storage capabilities, databases, sensors, and human collaboration. Future Generation Computer Systems aims to pioneer advancements in distributed systems, collaborative environments, high-performance computing, and Big Data analytics. It strives to stay at the forefront of developments in grids, clouds, and the Internet of Things (IoT) to effectively address the challenges posed by these wide-area, fully distributed sensing and computing systems.