Xuzhe Liu , Yuchong Hu , Weichun Wang , Dan Feng , Hai Zhou
Journal of Systems Architecture, Volume 161, Article 103369. Published 2025-02-19. DOI: 10.1016/j.sysarc.2025.103369
Impact factor: 3.7. JCR: Q1 (Computer Science, Hardware & Architecture).
https://www.sciencedirect.com/science/article/pii/S1383762125000414
Optimizing encoding and repair for wide-stripe minimum bandwidth regenerating codes in in-memory key-value stores
In-memory key–value (KV) stores are essential for databases and large-scale websites. While recent studies deploy wide-stripe erasure coding in such systems to ensure data reliability and achieve extreme storage savings, they also introduce a repair penalty. A class of erasure codes, Minimum Bandwidth Regenerating (MBR) codes, offers optimal single-chunk repair bandwidth. However, deploying wide-stripe MBR codes in this context results in two types of additional traffic: (i) encoding traffic incurred by transmitting large amounts of raw data between nodes; (ii) repair traffic from retrieving unnecessary data to repair failed data.
This paper proposes MBRWide to optimize encoding and repair performance for wide-stripe MBR codes in in-memory KV stores. MBRWide includes an all-node cooperative encoding scheme (ACES) and a fragmented repair scheme (FRS). ACES selectively encodes raw chunks to reduce encoding traffic. FRS aims to enhance repair efficiency by dynamically fragmenting parity chunks during encoding. This study implements MBRWide in Memcached, a foundational component in real-world in-memory KV services. Experiments show that ACES improves encoding throughput by 16.02% to 72.92% compared to traditional encoding methods. FRS reduces the latency of degraded reads to failed data by up to 34.19% and the latency of repairing multiple failures by up to 44.89%.
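To make the abstract's point about single-chunk repair bandwidth concrete, the sketch below compares the repair traffic of a conventional MDS code (such as Reed-Solomon) with that of a code at the MBR point, using the standard regenerating-code formula gamma = 2Bd / (k(2d - k + 1)). This is not taken from the paper: the function names and the stripe parameters (B = 64, k = 32, d = 35) are illustrative assumptions, and the sketch says nothing about how ACES or FRS work.

```python
from fractions import Fraction


def mds_repair_traffic(file_size, k):
    """Conventional MDS (e.g. Reed-Solomon) repair of one chunk.

    The repairer fetches k surviving chunks of size file_size / k,
    so the traffic equals the whole stripe's data size.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    return Fraction(file_size)


def mbr_repair_traffic(file_size, k, d):
    """Repair traffic at the MBR point of the regenerating-code trade-off.

    file_size is the data size B, k the number of nodes sufficient to
    decode, d the number of helper nodes contacted during repair.
    At the MBR point the per-node storage alpha equals this value too.
    """
    if not 1 <= k <= d:
        raise ValueError("require 1 <= k <= d")
    return Fraction(2 * file_size * d, k * (2 * d - k + 1))


# Hypothetical wide stripe: 64 units of data, k = 32, d = 35 helpers.
B, K, D = 64, 32, 35
mds = mds_repair_traffic(B, K)
mbr = mbr_repair_traffic(B, K, D)
print(f"MDS repair traffic: {float(mds):.2f} units")
print(f"MBR repair traffic: {float(mbr):.2f} units")
print(f"MDS chunk size:     {float(Fraction(B, K)):.2f} units")
```

Under these assumed parameters the MBR repair moves only about as much data as one node stores, at the cost of each node storing more than the MDS chunk size of B / k. That storage-for-bandwidth trade is what defines the MBR point.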
About the journal:
The Journal of Systems Architecture: Embedded Software Design (JSA) is a journal covering all design and architectural aspects related to embedded systems and software. It ranges from the microarchitecture level via the system software level up to the application-specific architecture level. Aspects such as real-time systems, operating systems, FPGA programming, programming languages, communications (limited to analysis and the software stack), mobile systems, parallel and distributed architectures as well as additional subjects in the computer and system architecture area will fall within the scope of this journal. Technology will not be a main focus, but its use and relevance to particular designs will be. Case studies are welcome but must contribute more than just a design for a particular piece of software.
Design automation of such systems, including methodologies, techniques and tools for their design, as well as novel designs of software components, falls within the scope of this journal. Novel applications that use embedded systems are also central to this journal. While hardware is not a part of this journal, hardware/software co-design methods that consider the interplay between software and hardware components, with an emphasis on software, are also relevant here.