Optimizing latency for caching with delayed hits in non-stationary environment

IF 0.8, Zone 4 (Computer Science), Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

Performance Evaluation, Volume 169, Article 102488. Pub Date: 2025-05-20. DOI: 10.1016/j.peva.2025.102488

Bowen Jiang, Yubo Yang, Bo Jiang


Citations: 0

Abstract


Optimizing latency for caching with delayed hits in non-stationary environment

Caching plays a crucial role in many latency-sensitive systems, including content delivery networks, edge computing, and microprocessors. As the ratio between system throughput and transmission latency increases, delayed hits in cache problems become more prominent. In real-world scenarios, object access patterns often exhibit a non-stationary nature. In this paper, we investigate the latency optimization problem for caching with delayed hits in a non-stationary environment, where object sizes and fetching latencies are both non-uniform. We first find that given known future arrivals, evicting the object with the larger size, a higher aggregate delay due to miss and arriving the farthest in the future brings more gains in reducing latency. Following our findings, we design an online learning framework to make cache decisions more effectively. The first component of this framework utilizes historical data within the training window to estimate the object’s non-stationary arrival process, modeled as a mixture of log-gaussian distributions. Subsequently, we predict future arrivals based on this estimated distribution. According to these predicted future arrivals, we can determine the priority of eviction candidates using our defined rank function. Experimental results on four real-world traces show that our algorithm consistently reduces latency by 2%-10% on average compared to state-of-the-art algorithms.
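To make the notion of a delayed hit concrete, here is a minimal Python sketch. It is not the authors' algorithm or rank function. It only illustrates the usual definition: a request that arrives while its object is still being fetched waits for the remainder of that fetch rather than the full latency. The function name `aggregate_delay`, its parameters, and the assumption that the object is never retained in the cache after a fetch completes are all hypothetical choices made for illustration.

```python
from typing import Optional, Sequence


def aggregate_delay(arrivals: Sequence[float], fetch_latency: float) -> float:
    """Total latency paid by requests to one object that is never cached.

    arrivals: request times for the object, sorted in increasing order.
    fetch_latency: time needed to fetch the object after a miss.
    Illustrative assumption: the object is not kept in the cache once a
    fetch completes, so the next request after completion is a new miss.
    """
    total = 0.0
    fetch_start: Optional[float] = None  # start time of the latest fetch
    for t in arrivals:
        if fetch_start is None or t >= fetch_start + fetch_latency:
            # True miss: start a new fetch and wait the full latency.
            fetch_start = t
            total += fetch_latency
        else:
            # Delayed hit: wait only for the remainder of the ongoing fetch.
            total += fetch_start + fetch_latency - t
    return total
```

For example, with arrivals at times 0, 1, 3, and 10 and a fetch latency of 5, the four requests wait 5, 4, 2, and 5 time units respectively, so the aggregate delay is 16. Under this reading, an object with bursty arrivals and a long fetch latency accumulates a large aggregate delay per miss, which is the quantity the abstract weighs alongside object size and next arrival time.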


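Source journal

Performance Evaluation (Engineering & Technology - Computer Science: Theory & Methods)

CiteScore: 3.10

Self-citation rate: 0.00%

Number of publications: (not listed)

Review time: 24 days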

Journal description: Performance Evaluation functions as a leading journal in the area of modeling, measurement, and evaluation of performance aspects of computing and communication systems. As such, it aims to present a balanced and complete view of the entire Performance Evaluation profession. Hence, the journal is interested in papers that focus on one or more of the following dimensions:

- Define new performance evaluation tools, including measurement and monitoring tools as well as modeling and analytic techniques
- Provide new insights into the performance of computing and communication systems
- Introduce new application areas where performance evaluation tools can play an important role and creative new uses for performance evaluation tools

More specifically, common application areas of interest include the performance of:

- Resource allocation and control methods and algorithms (e.g. routing and flow control in networks, bandwidth allocation, processor scheduling, memory management)
- System architecture, design and implementation
- Cognitive radio
- VANETs
- Social networks and media
- Energy efficient ICT
- Energy harvesting
- Data centers
- Data centric networks
- System reliability
- System tuning and capacity planning
- Wireless and sensor networks
- Autonomic and self-organizing systems
- Embedded systems
- Network science