An End-to-End Automatic Cache Replacement Policy Using Deep Reinforcement Learning

International Conference on Automated Planning and Scheduling. Pub Date: 2022-06-13. DOI: 10.1609/icaps.v32i1.19840

Yangqing Zhou, Fang Wang, Zhan Shi, D. Feng


Citations: 3

Abstract

Over the past few decades, much research has been conducted on the design of cache replacement policies. Prior work frequently relies on manually engineered heuristics to capture the most common cache access patterns, or predicts the reuse distance and tries to identify blocks that are either cache-friendly or cache-averse. Researchers are now applying recent advances in machine learning to guide cache replacement, augmenting or replacing traditional heuristics and data structures. However, most existing approaches depend on a specific environment, which restricts their application; e.g., most of them consider only the on-chip cache setting with program counters (PCs). Moreover, approaches with attractive hit rates are usually unable to deal with modern irregular workloads because of the limited features they use. In contrast, we propose a pervasive cache replacement framework that uses deep reinforcement learning to automatically learn the relationship between the workload distribution and the probability distribution over different replacement policies. We train an end-to-end cache replacement policy using only the past requested addresses, through two simple and stable cache replacement policies. Furthermore, the overall framework can be easily plugged into any scenario that requires a cache. Our simulation results on 8 production storage traces, run against 3 different cache configurations, confirm that the proposed cache replacement policy is effective and outperforms several state-of-the-art approaches.
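The idea of choosing between two simple replacement policies according to a probability can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the two policies, so LRU and LFU are assumed here, and the fixed `p_lru` parameter is a hypothetical stand-in for the distribution that a trained deep reinforcement learning agent would produce from the past requested addresses. The names `TwoPolicyCache` and `hit_rate` are likewise made up for illustration.

```python
import random
from collections import Counter, OrderedDict


class TwoPolicyCache:
    """Cache that picks, on each eviction, between two assumed experts: LRU and LFU."""

    def __init__(self, capacity, p_lru=0.5, seed=0):
        self.capacity = capacity
        self.p_lru = p_lru           # probability of following LRU (assumed fixed here)
        self.rng = random.Random(seed)
        self.blocks = OrderedDict()  # resident addresses, oldest use first
        self.freq = Counter()        # access count of each resident address
        self.hits = 0
        self.misses = 0

    def _victim(self):
        # Sample which policy decides this eviction.
        if self.rng.random() < self.p_lru:
            return next(iter(self.blocks))                   # least recently used
        return min(self.blocks, key=lambda a: self.freq[a])  # least frequently used

    def access(self, address):
        """Process one request; return True on a hit, False on a miss."""
        if address in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(address)
            self.freq[address] += 1
            return True
        self.misses += 1
        if len(self.blocks) >= self.capacity:
            victim = self._victim()
            del self.blocks[victim]
            del self.freq[victim]
        self.blocks[address] = None
        self.freq[address] = 1
        return False


def hit_rate(trace, capacity, p_lru):
    """Replay a trace of addresses and return the fraction of hits."""
    cache = TwoPolicyCache(capacity, p_lru)
    for address in trace:
        cache.access(address)
    return cache.hits / len(trace)


trace = [1, 1, 2, 3, 1]
print(hit_rate(trace, capacity=2, p_lru=1.0))  # always follow LRU
print(hit_rate(trace, capacity=2, p_lru=0.0))  # always follow LFU
```

On this short trace the two extremes give different hit rates, which is the kind of workload dependence a learned mixture is meant to exploit. In a learned variant, `p_lru` would presumably be updated from the request history rather than held constant.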

