Optimizing latency for caching with delayed hits in non-stationary environment
Authors: Bowen Jiang, Yubo Yang, Bo Jiang
Journal: Performance Evaluation, vol. 169, Article 102488 (Journal Article)
Published: 2025-05-20
DOI: 10.1016/j.peva.2025.102488
URL: https://www.sciencedirect.com/science/article/pii/S0166531625000227
Impact factor: 1.0 (JCR Q4, Computer Science, Hardware & Architecture)
Citations: 0
Abstract
Caching plays a crucial role in many latency-sensitive systems, including content delivery networks, edge computing, and microprocessors. As the ratio between system throughput and transmission latency increases, delayed hits become more prominent in caching problems. In real-world scenarios, object access patterns are often non-stationary. In this paper, we investigate the latency optimization problem for caching with delayed hits in a non-stationary environment, where object sizes and fetching latencies are both non-uniform. We first find that, given known future arrivals, evicting the object with a larger size, a higher aggregate delay due to misses, and the farthest next arrival in the future brings more gains in reducing latency. Following these findings, we design an online learning framework to make cache decisions more effectively. The first component of this framework uses historical data within the training window to estimate each object's non-stationary arrival process, modeled as a mixture of log-Gaussian distributions. We then predict future arrivals based on this estimated distribution. From these predicted arrivals, we determine the priority of eviction candidates using our defined rank function. Experimental results on four real-world traces show that our algorithm consistently reduces latency by 2%-10% on average compared to state-of-the-art algorithms.
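To make the notion of aggregate delay concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes the usual delayed-hit model: a miss starts a fetch that takes `fetch_latency` time units, and any request for the same object that arrives before the fetch completes waits for the remaining time. The function name `aggregate_delay` and the example numbers are hypothetical.

```python
from typing import Sequence


def aggregate_delay(arrivals: Sequence[float], miss_time: float, fetch_latency: float) -> float:
    """Total latency caused by one miss, including its delayed hits.

    Assumption (not stated in the abstract): the fetch starts at miss_time
    and completes at miss_time + fetch_latency. A request for the same
    object arriving strictly inside that window is a delayed hit and waits
    until the fetch completes.
    """
    ready = miss_time + fetch_latency
    total = fetch_latency  # the request that triggered the miss waits for the full fetch
    for t in arrivals:
        if miss_time < t < ready:
            total += ready - t  # delayed hit: waits for the remaining fetch time
    return total


# Hypothetical example: miss at t=0, fetch takes 10 time units,
# later requests for the same object arrive at t=2, 5, and 12.
print(aggregate_delay([2.0, 5.0, 12.0], miss_time=0.0, fetch_latency=10.0))  # 23.0
```

In this example the requests at t=2 and t=5 are delayed hits that wait 8 and 5 time units, while the request at t=12 arrives after the fetch completes and is an ordinary hit. The single miss therefore costs 23 time units in total rather than 10, which is why a per-miss latency count can understate the cost of evicting a frequently requested object.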
About the journal:
Performance Evaluation functions as a leading journal in the area of modeling, measurement, and evaluation of performance aspects of computing and communication systems. As such, it aims to present a balanced and complete view of the entire Performance Evaluation profession. Hence, the journal is interested in papers that focus on one or more of the following dimensions:
-Define new performance evaluation tools, including measurement and monitoring tools as well as modeling and analytic techniques
-Provide new insights into the performance of computing and communication systems
-Introduce new application areas where performance evaluation tools can play an important role and creative new uses for performance evaluation tools.
More specifically, common application areas of interest include the performance of:
-Resource allocation and control methods and algorithms (e.g. routing and flow control in networks, bandwidth allocation, processor scheduling, memory management)
-System architecture, design and implementation
-Cognitive radio
-VANETs
-Social networks and media
-Energy efficient ICT
-Energy harvesting
-Data centers
-Data centric networks
-System reliability
-System tuning and capacity planning
-Wireless and sensor networks
-Autonomic and self-organizing systems
-Embedded systems
-Network science