Kubestorage: A Cloud Native Storage Engine for Massive Small Files

Fuxin Liu, Jingwei Li, Yihong Wang, Lin Li
Published in: 2019 6th International Conference on Behavioral, Economic and Socio-Cultural Computing (BESC), 2019-10-01
DOI: 10.1109/BESC48373.2019.8962995 (https://doi.org/10.1109/BESC48373.2019.8962995)
Citations: 2

Abstract

Cloud Native, an emerging computing infrastructure, has become a new trend in cloud computing, especially since the development of containerization technologies such as Docker and LXD and of orchestration systems for them such as Kubernetes and Swarm. With the growing popularity of Cloud Native, the following problems have arisen: (i) Most Cloud Native applications are designed to make full use of the cloud platform, but their file storage has not been fully optimized for it. (ii) The traditional file system is designed as a utility for storing and retrieving files and is usually built into the operating system kernel. At large scale, for example on a network storage server that is shared by thousands of computing instances and stores millions of files, it becomes slow and even unstable. (iii) Most storage solutions use metadata to track files faster, but the metadata itself takes up a lot of space, and the capacity available for it is usually limited. If the file system stores metadata directly on hard disk without caching, tracking massive numbers of small files becomes much slower. (iv) Traditional object storage solutions do not provide features such as caching and automatic replication that would make them more practical on the cloud. This paper proposes a new storage engine based on the well-known Haystack storage engine, optimized in terms of service discovery and automated fault tolerance to make it more suitable for Cloud Native infrastructure, deployment, and applications. We use the object storage model to meet large-volume, high-frequency file storage needs, offering a simple and unified set of APIs for applications to access. We also take advantage of Kubernetes' sophisticated and automated toolchains to make cloud storage easier to deploy, more flexible to scale, and more stable to run.
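To make the metadata point in (iii) concrete, the following is a minimal sketch of the general Haystack-style idea the abstract builds on: many small files are appended to one large volume file, and a small in-memory index maps each key to its offset and size, so a read needs one seek rather than a per-file metadata lookup on disk. This is not the Kubestorage implementation. The class name `Volume`, its methods, and the file name `volume.dat` are hypothetical, and the sketch leaves out everything the paper adds on top, such as service discovery, replication, and fault tolerance.

```python
import os
import tempfile


class Volume:
    """Hypothetical Haystack-style volume: one append-only file plus an in-memory index."""

    def __init__(self, path):
        self.path = path
        self.index = {}  # key -> (offset, size); held in memory, never read from disk
        # "a+b" creates the file if needed; every write is appended at the end.
        self.file = open(path, "a+b")

    def put(self, key, data):
        """Append the bytes to the volume and record where they landed."""
        self.file.seek(0, os.SEEK_END)
        offset = self.file.tell()
        self.file.write(data)
        self.file.flush()
        self.index[key] = (offset, len(data))

    def get(self, key):
        """Look up offset and size in memory, then do a single seek and read."""
        offset, size = self.index[key]
        self.file.seek(offset)
        return self.file.read(size)

    def close(self):
        self.file.close()


def demo():
    """Store two small objects and show that only one file exists on disk."""
    with tempfile.TemporaryDirectory() as tmp:
        vol = Volume(os.path.join(tmp, "volume.dat"))
        vol.put("a.txt", b"hello")
        vol.put("b.txt", b"small file")
        result = (vol.get("a.txt"), vol.get("b.txt"), os.listdir(tmp))
        vol.close()
        return result
```

Calling `demo()` stores two objects yet leaves a single file in the directory, which is the property that keeps file system metadata small when the number of objects grows. A real engine would also need to persist or rebuild the index after a restart, which this sketch does not attempt.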