Deep Q-Learning for Chunk-based Caching in Data Processing Networks

2019 57th Annual Allerton Conference on Communication, Control, and Computing (Allerton) Pub Date: 2019-09-01 DOI: 10.1109/ALLERTON.2019.8919777

Yimeng Wang, Yongbo Li, Tian Lan, V. Aggarwal


Citations: 0

Abstract

A Data Processing Network (DPN) streams massive volumes of data collected and stored by the network to multiple processing units to compute desired results in a timely fashion. Due to ever-increasing traffic, distributed cache nodes can be deployed to store hot data and rapidly deliver them for consumption. However, prior work on caching policies has primarily focused on the potential gains in network performance, e.g., cache hit ratio and download latency, while neglecting the impact of caching on data processing and consumption.

In this paper, we propose a novel framework, DeepChunk, which leverages deep Q-learning for chunk-based caching in DPN. We show that cache policies must be optimized for both network performance during data delivery and processing efficiency during data consumption. Specifically, DeepChunk takes a model-free approach, jointly learning limited network, data streaming, and processing statistics at runtime and making cache update decisions under the guidance of deep Q-learning. It enables joint optimization of multiple objectives, including chunk hit ratio, processing stall time, and object download time, while adapting to time-varying workload and network conditions. We build a prototype implementation of DeepChunk with Ceph, a popular distributed object storage system. Our extensive experiments and evaluation demonstrate significant improvement, i.e., 43% in total reward and 39% in processing stall time, over a number of baseline caching policies.
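To make the idea of a multi-objective reward driving cache update decisions concrete, here is a minimal sketch. It is not the DeepChunk implementation: the abstract does not say how the three objectives are combined, so the weighted sum, the weights `W_HIT`, `W_STALL`, `W_DOWNLOAD`, the state labels, and the action meaning (which cache slot to evict) are all assumptions made for illustration. A tabular Q-learning agent stands in for the deep Q-network so that the example needs only the standard library.

```python
import random
from collections import defaultdict

# Hypothetical weights: the abstract names the objectives, not their weighting.
W_HIT, W_STALL, W_DOWNLOAD = 1.0, 0.5, 0.5


def reward(chunk_hit_ratio, stall_time_s, download_time_s):
    """Assumed reward: favor chunk hits, penalize stall and download time."""
    return (W_HIT * chunk_hit_ratio
            - W_STALL * stall_time_s
            - W_DOWNLOAD * download_time_s)


class TabularQAgent:
    """Tabular Q-learning, used here as a stand-in for a deep Q-network."""

    def __init__(self, n_actions, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.n_actions = n_actions
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor
        self.epsilon = epsilon  # exploration probability
        self.q = defaultdict(lambda: [0.0] * n_actions)
        self.rng = random.Random(seed)

    def act(self, state):
        """Epsilon-greedy choice of an action, e.g. which slot to evict."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        values = self.q[state]
        return max(range(self.n_actions), key=values.__getitem__)

    def update(self, state, action, r, next_state):
        """One-step Q-learning update toward r + gamma * max_a Q(s', a)."""
        target = r + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])


# Usage: observe statistics after a cache update, then learn from them.
agent = TabularQAgent(n_actions=3, epsilon=0.0)
r = reward(chunk_hit_ratio=0.8, stall_time_s=0.2, download_time_s=0.4)
agent.update("s0", 1, r, "s1")
best_action = agent.act("s0")
```

In a deep Q-learning setting the table `q` would be replaced by a neural network over a feature vector of network, streaming, and processing statistics; the update target keeps the same form.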

