Towards benchmarking erasure coding schemes in object storage system: A systematic review

Impact factor: 6.2 | Zone 2, Computer Science | JCR Q1, Computer Science, Theory & Methods
Jannatun Noor, Rezuana Imtiaz Upoma, Md. Sadiqul Islam Sakif, A.B.M. Alim Al Islam
Future Generation Computer Systems, vol. 163, Article 107522. Published 2024-09-04. DOI: 10.1016/j.future.2024.107522
https://www.sciencedirect.com/science/article/pii/S0167739X24004862
Citations: 0

Abstract

Erasure Coding (EC) in cloud storage minimizes data replication by reconstructing data from parity fragments. This method enhances data redundancy and efficiency while reducing storage costs and improving fault tolerance. It is more advantageous than replication in Object Storage Systems. EC guarantees data integrity by ensuring lossless transmission of all coded pieces. As data volumes continue to increase rapidly, the time efficiency of the EC method becomes crucial in ensuring optimal system performance. Various variables, including the algorithm employed, data size, number of storage nodes, hardware resources, and network conditions, can influence the speed of EC operations. Although some literature covers various aspects, there is still a research gap in understanding the I/O activities, time efficiency, and fault tolerance of EC in object storage systems. Hence, our research aims to address these challenges in cloud-based object storage systems. We analyze and benchmark the data storage I/O performance of OpenStack Swift, focusing on the time efficiency of the Reed–Solomon (RS) algorithm across two datasets. Additionally, our contributions include benchmarking EC performance in both local and remote testbeds, utilizing the SimEDC simulator for comprehensive efficiency and fault tolerance assessments. Moreover, we create a comprehensive dataset (MCSD-100) for benchmarking and conduct a systematic literature review. Finally, we identify and discuss future opportunities for enhancing EC in cloud-based object storage systems.
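The idea of rebuilding data from parity fragments, and the storage saving over replication, can be illustrated with a minimal sketch. This is not the Reed–Solomon code benchmarked in the paper and it does not use OpenStack Swift: it is a single XOR parity fragment, the simplest erasure code, which tolerates the loss of exactly one fragment. The function names (`encode`, `reconstruct`, `storage_overhead`) and the fragment counts used are assumptions made for illustration only.

```python
from __future__ import annotations

from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> list[bytes]:
    """Split data into k equal-size fragments and append one XOR parity fragment."""
    frag_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(frag_len * k, b"\0")  # zero-pad so all fragments match in size
    fragments = [padded[i * frag_len:(i + 1) * frag_len] for i in range(k)]
    parity = reduce(xor_bytes, fragments)
    return fragments + [parity]


def reconstruct(fragments: list[bytes | None]) -> list[bytes]:
    """Rebuild at most one missing fragment (marked None) by XORing the survivors."""
    missing = [i for i, f in enumerate(fragments) if f is None]
    if len(missing) > 1:
        raise ValueError("a single parity fragment tolerates only one lost fragment")
    repaired = list(fragments)
    if missing:
        survivors = [f for f in fragments if f is not None]
        repaired[missing[0]] = reduce(xor_bytes, survivors)
    return repaired


def storage_overhead(k: int, m: int) -> float:
    """Raw bytes stored per byte of user data for k data and m parity fragments."""
    return (k + m) / k


# Hypothetical usage: lose one of five stored fragments, then repair it.
stored = encode(b"object payload stored in Swift", k=4)
damaged = list(stored)
damaged[2] = None
repaired = reconstruct(damaged)
```

Under the same assumptions, a code with 4 data and 2 parity fragments stores 1.5 bytes per byte of user data, whereas keeping three full replicas stores 3. A Reed–Solomon code generalizes the sketch by using finite-field arithmetic so that any m of the k + m fragments can be lost.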

Source journal
CiteScore: 19.90
Self-citation rate: 2.70%
Articles published: 376
Review time: 10.6 months
About the journal: Computing infrastructures and systems are constantly evolving, resulting in increasingly complex and collaborative scientific applications. To cope with these advancements, there is a growing need for collaborative tools that can effectively map, control, and execute these applications. Furthermore, with the explosion of Big Data, there is a requirement for innovative methods and infrastructures to collect, analyze, and derive meaningful insights from the vast amount of data generated. This necessitates the integration of computational and storage capabilities, databases, sensors, and human collaboration. Future Generation Computer Systems aims to pioneer advancements in distributed systems, collaborative environments, high-performance computing, and Big Data analytics. It strives to stay at the forefront of developments in grids, clouds, and the Internet of Things (IoT) to effectively address the challenges posed by these wide-area, fully distributed sensing and computing systems.