SmartKV: A cost-effective and low-latency geo-distributed key-value store for the computing continuum
Juan Aznar Poveda, Maximilian Franz Ebner, Thomas Fahringer, Zahra Najafabadi Samani, Marlon Etheredge, Stefan Pedratscher, Nishant Saurabh
Future Generation Computer Systems-The International Journal of Escience, Volume 171, Article 107857. Published 2025-04-19.
DOI: 10.1016/j.future.2025.107857
https://www.sciencedirect.com/science/article/pii/S0167739X25001529
Abstract
Many data-intensive and distributed applications rely on low-latency and scalable key–value storage systems across the Computing Continuum. Key–value storage systems typically use consistent hashing or hash-slot sharding to distribute data across storage nodes, which ensures load balancing but often leads to sub-optimal response times and monetary costs, particularly in geo-distributed systems where nodes might have different unit prices and be widely dispersed. In this paper, we propose SmartKV, a cost-efficient geo-distributed key–value store that optimizes data placement dynamically, abstracting the intricacies of data organization, transfer, access, and processing. SmartKV integrates a decentralized data placement algorithm that optimizes the replication factor and selects suitable locations for key–value pairs and replicas, balancing cost and access latency while keeping optimization overhead low. We employ a realistic cost model, based on public and private Cloud and Edge providers, that considers data transfer, request, and storage costs. In addition to conventional key–value pairs, SmartKV supports active key–value pairs, which enable the definition of custom data types and the execution of user-defined functions directly on the storage side. This helps reduce data transfer costs and round-trip times. We thoroughly evaluate SmartKV across different regions of the Chameleon testbed using several realistic workloads. Results show that the decentralized data placement strategy allows SmartKV to reduce round-trip times by 9% to 84% while reducing costs by up to 4.84× under different client workloads and consistency models, compared to state-of-the-art data placement strategies.
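The trade-off the abstract describes, choosing a storage location by weighing monetary cost (storage, request, and data transfer) against access latency, can be illustrated with a minimal sketch. This is not SmartKV's algorithm, which the abstract does not specify; the `Node` fields, the prices, the latencies, the weighting parameter `alpha`, and the max-normalized scoring are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A candidate storage location. All values are hypothetical."""
    name: str
    storage_price: float   # cost per GB stored per month
    request_price: float   # cost per request
    transfer_price: float  # cost per GB transferred out
    latency_ms: float      # round-trip time observed by the client


def monthly_cost(node: Node, size_gb: float, requests: int, transferred_gb: float) -> float:
    """Sum the three cost components named in the abstract."""
    return (node.storage_price * size_gb
            + node.request_price * requests
            + node.transfer_price * transferred_gb)


def place(nodes, size_gb, requests, transferred_gb, alpha=0.5):
    """Return the node with the lowest weighted score.

    alpha = 1.0 considers only cost; alpha = 0.0 considers only latency.
    Cost and latency are each divided by their maximum over the candidates
    so that the two terms are comparable (an assumed normalization).
    """
    costs = {n.name: monthly_cost(n, size_gb, requests, transferred_gb) for n in nodes}
    max_cost = max(costs.values()) or 1.0            # avoid division by zero
    max_latency = max(n.latency_ms for n in nodes) or 1.0

    def score(n: Node) -> float:
        return (alpha * costs[n.name] / max_cost
                + (1.0 - alpha) * n.latency_ms / max_latency)

    return min(nodes, key=score)


# Hypothetical candidates: a nearby but expensive edge node and a
# distant but cheaper cloud region.
EDGE = Node("edge", storage_price=0.50, request_price=0.0,
            transfer_price=0.0, latency_ms=5.0)
CLOUD = Node("cloud", storage_price=0.02, request_price=0.000001,
             transfer_price=0.09, latency_ms=80.0)

cost_only = place([EDGE, CLOUD], size_gb=10, requests=1_000_000,
                  transferred_gb=10, alpha=1.0)
latency_only = place([EDGE, CLOUD], size_gb=10, requests=1_000_000,
                     transferred_gb=10, alpha=0.0)
```

With these made-up numbers, a purely cost-driven placement selects the cloud node and a purely latency-driven placement selects the edge node; intermediate values of `alpha` interpolate between the two objectives. A hash-based placement, by contrast, would ignore both price and distance.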
About the journal:
Computing infrastructures and systems are constantly evolving, resulting in increasingly complex and collaborative scientific applications. To cope with these advancements, there is a growing need for collaborative tools that can effectively map, control, and execute these applications.
Furthermore, with the explosion of Big Data, there is a requirement for innovative methods and infrastructures to collect, analyze, and derive meaningful insights from the vast amount of data generated. This necessitates the integration of computational and storage capabilities, databases, sensors, and human collaboration.
Future Generation Computer Systems aims to pioneer advancements in distributed systems, collaborative environments, high-performance computing, and Big Data analytics. It strives to stay at the forefront of developments in grids, clouds, and the Internet of Things (IoT) to effectively address the challenges posed by these wide-area, fully distributed sensing and computing systems.