CALC: A Content-Aware Learning Cache for Storage Systems

Maher Kachmar, D. Kaeli
2021 IEEE International Conference on Networking, Architecture and Storage (NAS), October 2021. DOI: 10.1109/nas51552.2021.9605381

Abstract

In today’s enterprise storage systems, supported services such as data deduplication are becoming a common feature in the data center, especially as new storage technologies mature. Static partitioning of storage system resources, including CPU cores and memory caches, may lead to missed Service Level Agreement (SLA) thresholds, such as the Data Reduction Rate (DRR) or IO latency. However, typical storage system applications exhibit workload patterns that can be learned. By learning these patterns, we are better equipped to address several storage system resource partitioning challenges, issues that cannot be overcome with traditional manual tuning and primitive feedback mechanisms. We propose a Content-Aware Learning Cache (CALC) that uses online reinforcement learning models (Q-Learning, SARSA, and Actor-Critic) to actively partition the storage system cache between a data digest cache, a content cache, and an address-based data cache to improve cache hit performance while maximizing data reduction rates. Using traces from popular storage applications, we show that our machine learning approach is robust and can outperform an iterative search method across various datasets and cache sizes. Our content-aware learning cache improves hit rates by 7.1% compared to iterative search methods, and by 18.2% compared to a traditional LRU-based data cache implementation.
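The abstract describes an online reinforcement-learning agent that chooses how to partition a fixed cache budget among three sub-caches. The paper's actual state space, action space, and reward are not given here; the sketch below is a toy, single-state Q-learning illustration of that idea, in which the candidate partitions and the simulated hit-rate reward are invented for demonstration and are not the authors' implementation.

```python
import random

# Hypothetical candidate (digest, content, address) cache splits,
# expressed as fractions of the total cache budget.
PARTITIONS = [
    (0.2, 0.4, 0.4),
    (0.3, 0.3, 0.4),
    (0.4, 0.4, 0.2),
    (0.5, 0.3, 0.2),
]

def simulated_hit_rate(split):
    """Stand-in for measuring the cache hit rate under a given split.

    A real agent would replay a storage workload trace; this toy model
    simply assumes content locality dominates, so the content cache
    contributes most to the hit rate.
    """
    digest, content, address = split
    return 0.3 * digest + 0.5 * content + 0.2 * address

def q_learning_partition(episodes=3000, alpha=0.1, epsilon=0.2, seed=0):
    """Tabular Q-learning over a single state: learn which split pays off."""
    rng = random.Random(seed)
    q = [0.0] * len(PARTITIONS)  # one Q-value per candidate split
    for _ in range(episodes):
        # Epsilon-greedy: mostly exploit the best-known split, sometimes explore.
        if rng.random() < epsilon:
            action = rng.randrange(len(PARTITIONS))
        else:
            action = max(range(len(PARTITIONS)), key=q.__getitem__)
        reward = simulated_hit_rate(PARTITIONS[action])
        # Single-state update: Q(a) <- Q(a) + alpha * (r - Q(a))
        q[action] += alpha * (reward - q[action])
    return PARTITIONS[max(range(len(PARTITIONS)), key=q.__getitem__)]

print(q_learning_partition())
```

With enough episodes, the agent's greedy choice settles on the split whose simulated hit rate is highest; in the paper, this selection loop would run online against live workload feedback rather than a fixed synthetic reward.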