{"title":"CALC:存储系统的内容感知学习缓存","authors":"Maher Kachmar, D. Kaeli","doi":"10.1109/nas51552.2021.9605381","DOIUrl":null,"url":null,"abstract":"In today’s enterprise storage systems, supported services such as data deduplication are becoming a common feature adopted in the data center, especially as new storage technologies mature. Static partitioning of storage system resources, including CPU cores and memory caches, may lead to missing Service Level Agreement (SLAs) thresholds, such as the Data Reduction Rate (DRR) or IO latency. However, typical storage system applications exhibit a workload pattern that can be learned. By learning these pattern, we are better equipped to address several storage system resource partitioning challenges, issues that cannot be overcome with traditional manual tuning and primitive feedback mechanisms.We propose a Content-Aware Learning Cache (CALC) that uses online reinforcement learning models (Q-Learning, SARSA and Actor-Critic) to actively partition the storage system cache between a data digest cache, content cache, and address-based data cache to improve cache hit performance, while maximizing data reduction rates. Using traces from popular storage applications, we show how our machine learning approach is robust and can out-perform an iterative search method for various datasets and cache sizes. Our content-aware learning cache improves hit rates by 7.1% when compared to iterative search methods, and 18.2% when compared to traditional LRU-based data cache implementation.","PeriodicalId":135930,"journal":{"name":"2021 IEEE International Conference on Networking, Architecture and Storage (NAS)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"CALC: A Content-Aware Learning Cache for Storage Systems\",\"authors\":\"Maher Kachmar, D. Kaeli\",\"doi\":\"10.1109/nas51552.2021.9605381\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In today’s enterprise storage systems, supported services such as data deduplication are becoming a common feature adopted in the data center, especially as new storage technologies mature. Static partitioning of storage system resources, including CPU cores and memory caches, may lead to missing Service Level Agreement (SLAs) thresholds, such as the Data Reduction Rate (DRR) or IO latency. However, typical storage system applications exhibit a workload pattern that can be learned. By learning these pattern, we are better equipped to address several storage system resource partitioning challenges, issues that cannot be overcome with traditional manual tuning and primitive feedback mechanisms.We propose a Content-Aware Learning Cache (CALC) that uses online reinforcement learning models (Q-Learning, SARSA and Actor-Critic) to actively partition the storage system cache between a data digest cache, content cache, and address-based data cache to improve cache hit performance, while maximizing data reduction rates. Using traces from popular storage applications, we show how our machine learning approach is robust and can out-perform an iterative search method for various datasets and cache sizes. 
Our content-aware learning cache improves hit rates by 7.1% when compared to iterative search methods, and 18.2% when compared to traditional LRU-based data cache implementation.\",\"PeriodicalId\":135930,\"journal\":{\"name\":\"2021 IEEE International Conference on Networking, Architecture and Storage (NAS)\",\"volume\":\"30 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-10-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2021 IEEE International Conference on Networking, Architecture and Storage (NAS)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/nas51552.2021.9605381\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE International Conference on Networking, Architecture and Storage (NAS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/nas51552.2021.9605381","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
CALC: A Content-Aware Learning Cache for Storage Systems
In today's enterprise storage systems, supported services such as data deduplication are becoming a common feature in the data center, especially as new storage technologies mature. Static partitioning of storage system resources, including CPU cores and memory caches, may lead to missed Service Level Agreement (SLA) thresholds, such as the Data Reduction Rate (DRR) or IO latency. However, typical storage system applications exhibit workload patterns that can be learned. By learning these patterns, we are better equipped to address several storage system resource partitioning challenges, issues that cannot be overcome with traditional manual tuning and primitive feedback mechanisms. We propose a Content-Aware Learning Cache (CALC) that uses online reinforcement learning models (Q-Learning, SARSA, and Actor-Critic) to actively partition the storage system cache between a data digest cache, a content cache, and an address-based data cache, improving cache hit performance while maximizing data reduction rates. Using traces from popular storage applications, we show that our machine learning approach is robust and can outperform an iterative search method across various datasets and cache sizes. Our content-aware learning cache improves hit rates by 7.1% compared to iterative search methods, and by 18.2% compared to a traditional LRU-based data cache implementation.
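
The abstract does not include an implementation, so the following is a minimal Python sketch of the kind of online Q-learning cache partitioner it describes. The class and function names, the state discretization, the action set, and the reward (a weighted sum of hit rate and DRR) are all illustrative assumptions, not the authors' actual design.

import random
from collections import defaultdict

# Minimal sketch of an online Q-learning cache partitioner in the spirit of
# CALC. State, actions, and reward below are illustrative assumptions, not
# the paper's actual formulation.

# Each action moves one share of cache capacity between two of the three
# caches (digest, content, address); the last action keeps the partition.
ACTIONS = [("digest", "content"), ("content", "digest"),
           ("digest", "address"), ("address", "digest"),
           ("content", "address"), ("address", "content"),
           (None, None)]

class QLearningPartitioner:
    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1, shares=12):
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q = defaultdict(float)  # Q[(state, action_index)] -> value
        # Start with the cache split evenly, expressed in integer shares.
        third = shares // 3
        self.partition = {"digest": third, "content": third,
                          "address": shares - 2 * third}

    def state(self, hit_rate, drr):
        # Discretize the observed hit rate and data reduction rate.
        return (int(hit_rate * 10), int(drr * 10))

    def choose(self, s):
        # Epsilon-greedy action selection.
        if random.random() < self.epsilon:
            return random.randrange(len(ACTIONS))
        return max(range(len(ACTIONS)), key=lambda a: self.q[(s, a)])

    def apply(self, a):
        # Shift one share of capacity, never emptying a cache entirely.
        src, dst = ACTIONS[a]
        if src is not None and self.partition[src] > 1:
            self.partition[src] -= 1
            self.partition[dst] += 1

    def update(self, s, a, reward, s_next):
        # One-step Q-learning:
        # Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        best_next = max(self.q[(s_next, a2)] for a2 in range(len(ACTIONS)))
        self.q[(s, a)] += self.alpha * (reward + self.gamma * best_next
                                        - self.q[(s, a)])

# Example control loop. Real telemetry would come from the storage system;
# random numbers stand in here so the sketch runs on its own.
agent = QLearningPartitioner()
s = agent.state(0.5, 0.5)
for epoch in range(1000):
    a = agent.choose(s)
    agent.apply(a)
    hit_rate, drr = random.random(), random.random()  # stand-in measurements
    reward = hit_rate + 0.5 * drr  # assumed weighting of hit rate vs. DRR
    s_next = agent.state(hit_rate, drr)
    agent.update(s, a, reward, s_next)
    s = s_next
print(agent.partition)

Under this formulation, SARSA would differ only in the update step, bootstrapping from the action actually taken next rather than from the greedy maximum, which is one way the three learners named in the abstract could be compared within a single framework.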