Competitive cost-effective memory access predictor through short-term online SVM and dynamic vocabularies

Impact Factor 6.2 · CAS Zone 2 (Computer Science) · JCR Q1, Computer Science, Theory & Methods
Pablo Sanchez-Cuevas, Fernando Diaz-del-Rio, Daniel Casanueva-Morato, Antonio Rios-Navarro
{"title":"Competitive cost-effective memory access predictor through short-term online SVM and dynamic vocabularies","authors":"Pablo Sanchez-Cuevas ,&nbsp;Fernando Diaz-del-Rio ,&nbsp;Daniel Casanueva-Morato ,&nbsp;Antonio Rios-Navarro","doi":"10.1016/j.future.2024.107592","DOIUrl":null,"url":null,"abstract":"<div><div>In recent years, there has been a significant increase in the processing of massive amounts of data, driven by the growing demands of mobile systems, parallel and distributed architectures, and real-time systems. This applies to various types of platforms, both specific and general-purpose. Despite numerous advancements in Computer Systems, a critical challenge remains: the efficiency and speed of memory access. This bottleneck is being addressed through cache prefetching, that is, by predicting the next memory address to be accessed and then by having always prefetched in the cache system those data to be used shortly by the processor. This paper explores established intelligent techniques for address prediction, examining their limitations and analyzing the memory access patterns of popular software applications. Building on the successes of previous intelligent predictors based on Machine and Deep Learning models, we introduce a new predictor, SVM4AP (Support Vector Machine For Address Prediction), designed to overcome the identified drawbacks of its predecessors. The architecture of SVM4AP improves the trade-off between performance and cost, compared to those previous proposals in the literature, achieving high accuracy through short-term learning. Comparisons are made with two prominent predictors from the literature: the classical DFCM (Differential Finite Context Method) and the contemporary Deep Learning-based DCLSTM (Doubly Compressed Long-Short Term Memory). The results demonstrate that SVM4AP achieves superior cost-effectiveness across various configurations. Simulations reveal that SVM4AP configurations dominate both DFCM and DCLSTM counterparts, forming the majority of the first Paretto front. Particularly noteworthy is the significant advantage of our proposal for small-size predictors. Furthermore, we release an open-source tool enabling the scientific community to reproduce the results presented in this paper using a set of benchmark traces.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"164 ","pages":"Article 107592"},"PeriodicalIF":6.2000,"publicationDate":"2024-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Future Generation Computer Systems-The International Journal of Escience","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0167739X24005569","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, THEORY & METHODS","Score":null,"Total":0}
Citations: 0

Abstract

In recent years, there has been a significant increase in the processing of massive amounts of data, driven by the growing demands of mobile systems, parallel and distributed architectures, and real-time systems. This applies to various types of platforms, both specific and general-purpose. Despite numerous advancements in Computer Systems, a critical challenge remains: the efficiency and speed of memory access. This bottleneck is being addressed through cache prefetching, that is, by predicting the next memory address to be accessed so that the data the processor will need shortly are already present in the cache. This paper explores established intelligent techniques for address prediction, examining their limitations and analyzing the memory access patterns of popular software applications. Building on the successes of previous intelligent predictors based on Machine and Deep Learning models, we introduce a new predictor, SVM4AP (Support Vector Machine For Address Prediction), designed to overcome the identified drawbacks of its predecessors. The architecture of SVM4AP improves the trade-off between performance and cost compared to previous proposals in the literature, achieving high accuracy through short-term learning. Comparisons are made with two prominent predictors from the literature: the classical DFCM (Differential Finite Context Method) and the contemporary Deep Learning-based DCLSTM (Doubly Compressed Long Short-Term Memory). The results demonstrate that SVM4AP achieves superior cost-effectiveness across various configurations. Simulations reveal that SVM4AP configurations dominate both their DFCM and DCLSTM counterparts, forming the majority of the first Pareto front. Particularly noteworthy is the significant advantage of our proposal for small-size predictors. Furthermore, we release an open-source tool enabling the scientific community to reproduce the results presented in this paper using a set of benchmark traces.
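
The abstract gives no implementation details, but the core mechanism it describes (an online SVM trained over a short recent history, classifying the next address delta against a dynamically grown vocabulary of observed deltas) can be sketched roughly as follows. This is a minimal illustrative sketch under stated assumptions, not the paper's SVM4AP design: the history length, the vocabulary cap, the feature encoding, and the use of scikit-learn's SGDClassifier with hinge loss as a stand-in for an online linear SVM are all assumptions.

```python
# Minimal sketch, NOT the authors' SVM4AP implementation: an online ("short-term")
# linear SVM that predicts the next address delta, with class labels drawn from a
# dynamically grown vocabulary of observed deltas. All names and parameters are
# illustrative assumptions.
from collections import deque

import numpy as np
from sklearn.linear_model import SGDClassifier

MAX_CLASSES = 256  # cap on the delta vocabulary (illustrative, not from the paper)


class OnlineDeltaPredictor:
    def __init__(self, history_len=4):
        self.history = deque(maxlen=history_len)  # recent address deltas (features)
        self.vocab = {}                           # dynamic vocabulary: delta -> class id
        self.inv_vocab = {}                       # class id -> delta
        self.model = SGDClassifier(loss="hinge")  # hinge loss ~ linear SVM, updated online
        self.fitted = False
        self.last_addr = None

    def _class_of(self, delta):
        # Grow the vocabulary on the fly; once the cap is reached, unseen deltas map to class 0.
        if delta not in self.vocab and len(self.vocab) < MAX_CLASSES:
            cid = len(self.vocab)
            self.vocab[delta] = cid
            self.inv_vocab[cid] = delta
        return self.vocab.get(delta, 0)

    def observe(self, addr):
        """Feed one memory access; return a predicted next address, or None."""
        if self.last_addr is not None:
            delta = addr - self.last_addr
            cid = self._class_of(delta)
            if len(self.history) == self.history.maxlen:
                x = np.array([list(self.history)], dtype=float)
                # Short-term online learning: one incremental update per access.
                if not self.fitted:
                    self.model.partial_fit(x, [cid], classes=np.arange(MAX_CLASSES))
                    self.fitted = True
                else:
                    self.model.partial_fit(x, [cid])
            self.history.append(delta)
        self.last_addr = addr

        if self.fitted and len(self.history) == self.history.maxlen:
            x = np.array([list(self.history)], dtype=float)
            next_cid = int(self.model.predict(x)[0])
            if next_cid in self.inv_vocab:
                return addr + self.inv_vocab[next_cid]  # candidate address to prefetch
        return None


# Toy usage: after a few accesses of a fixed-stride pattern, the predictor
# starts returning a next-address guess that a prefetcher could act on.
pred = OnlineDeltaPredictor()
for a in range(0x1000, 0x1100, 8):
    guess = pred.observe(a)
```

A hardware predictor of this kind would be tightly constrained by storage and logic cost, which is precisely the performance-versus-cost trade-off the paper evaluates against DFCM and DCLSTM.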
Source journal metrics:
CiteScore: 19.90
Self-citation rate: 2.70%
Articles published: 376
Average review time: 10.6 months
Journal description: Computing infrastructures and systems are constantly evolving, resulting in increasingly complex and collaborative scientific applications. To cope with these advancements, there is a growing need for collaborative tools that can effectively map, control, and execute these applications. Furthermore, with the explosion of Big Data, there is a requirement for innovative methods and infrastructures to collect, analyze, and derive meaningful insights from the vast amount of data generated. This necessitates the integration of computational and storage capabilities, databases, sensors, and human collaboration. Future Generation Computer Systems aims to pioneer advancements in distributed systems, collaborative environments, high-performance computing, and Big Data analytics. It strives to stay at the forefront of developments in grids, clouds, and the Internet of Things (IoT) to effectively address the challenges posed by these wide-area, fully distributed sensing and computing systems.