iFlatLFS: Performance optimization for accessing massive small files

Songling Fu, Chenlin Huang, Ligang He, Nadeem Chaudhary, Xiangke Liao, Shazhou Yang, Xiaochuan Wang, Bao Li
Venue: 20th Annual International Conference on High Performance Computing. Publication date: 2013-12-01. DOI: 10.1109/HiPC.2013.6799116 (https://doi.org/10.1109/HiPC.2013.6799116)
Citations: 2

Abstract

Handling massive numbers of small files is a challenge in the design of distributed file systems. The combined-block-storage approach is currently prevalent, but it relies on traditional file systems such as ExtFS, which can make random access to small files inefficient. This paper focuses on optimizing the performance of data servers when accessing massive numbers of small files. We present a Flat Lightweight File System (iFlatLFS) for managing small files, based on a simple metadata scheme and a flat storage architecture. iFlatLFS aims to replace the traditional file system on data servers that mainly store small files, and it can greatly simplify the original data access procedure. The new metadata proposed in this paper occupies only a fraction of the space of the original metadata in traditional file systems. We have implemented iFlatLFS on CentOS 5.5 and integrated it into an open-source Distributed File System (DFS), Taobao FileSystem (TFS), which is developed by Alibaba, a top B2C service provider in China, and manages over 28.6 billion small photos. We have conducted extensive experiments to verify the performance of iFlatLFS. The results show that for file sizes from 1KB to 64KB, iFlatLFS is faster than Ext4 by 48% and 54% on average for random read and write in the DFS environment, respectively. Moreover, once integrated into TFS, iFlatLFS-based TFS is faster than the existing Ext4-based TFS by 45% and 49% on average for random read access and hybrid access (a mix of read and write accesses), respectively.
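The abstract does not spell out the on-disk layout, so the following is only a minimal sketch of the general idea behind a flat store with compact metadata, not the iFlatLFS implementation. It assumes that small files are appended to one large data file and that each file is located through a fixed-size index entry (file id, offset, length), so a read is a single seek plus a single read. The names `FlatStore`, `ENTRY`, `put`, `get`, and `index_bytes` are hypothetical and chosen for illustration.

```python
import os
import struct
import tempfile

# Hypothetical fixed-size index entry: file id (8 bytes), offset (8 bytes),
# length (4 bytes). "<" disables padding, so each entry is exactly 20 bytes.
ENTRY = struct.Struct("<QQI")


class FlatStore:
    """Toy flat store: every small file lives inside one large data file."""

    def __init__(self, path):
        self.path = path
        # Assumption: the whole index fits in memory as id -> (offset, length).
        self.index = {}
        # Create the data file if it does not exist yet.
        open(path, "ab").close()

    def put(self, file_id, data):
        """Append the bytes to the data file and record where they landed."""
        with open(self.path, "ab") as f:
            f.seek(0, os.SEEK_END)
            offset = f.tell()
            f.write(data)
        self.index[file_id] = (offset, len(data))
        return offset

    def get(self, file_id):
        """Read a small file back with one seek and one read."""
        offset, length = self.index[file_id]
        with open(self.path, "rb") as f:
            f.seek(offset)
            return f.read(length)

    def index_bytes(self):
        """Size the index would take if serialized with ENTRY."""
        return ENTRY.size * len(self.index)


if __name__ == "__main__":
    store = FlatStore(os.path.join(tempfile.mkdtemp(), "block.dat"))
    store.put(1, b"first photo bytes")
    store.put(2, b"second photo bytes")
    print(store.get(2), store.index_bytes())
```

In this sketch the per-file metadata is a fixed 20 bytes. That choice is made here only to illustrate why a flat scheme can need far less metadata than a general-purpose file system's per-file structures; the actual metadata format and sizes are those given in the paper itself.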