Performance Optimization of In-Memory File System in Distributed Storage System
Zhaowei Li, Yunlong Yan, Jintao Mo, Zhaocong Wen, Junmin Wu
2017 International Conference on Networking, Architecture, and Storage (NAS), published 2017-08-01
DOI: 10.1109/NAS.2017.8026870 (https://doi.org/10.1109/NAS.2017.8026870)
Citations: 5
Abstract
Hadoop, an open source framework for Big Data, can process large amounts of data in parallel and has attracted growing attention in academia and industry. This paper analyzes two in-memory file system approaches to improving system I/O efficiency: the HDFS Lazy Persist strategy and Alluxio. In addition, because the Lazy Persist strategy otherwise has to be triggered manually each time, we propose an automatic trigger mechanism for HDFS Lazy Persist based on statistics of data access information.
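The abstract does not say how the automatic trigger decides when to enable Lazy Persist, so the following is only a minimal sketch of one possible reading: count accesses per path and, once a path crosses a threshold, build the standard HDFS command that sets the `LAZY_PERSIST` storage policy on it. The class name `LazyPersistTrigger`, the `threshold` parameter, and the plain access-count rule are all assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter


class LazyPersistTrigger:
    """Hypothetical access-count trigger; not the paper's actual algorithm."""

    def __init__(self, threshold=3):
        self.threshold = threshold      # assumed cutoff for treating a path as "hot"
        self.access_counts = Counter()  # path -> number of recorded accesses
        self.enabled = set()            # paths already switched to LAZY_PERSIST

    def record_access(self, path):
        """Record one access; return a command if the path just became hot, else None."""
        self.access_counts[path] += 1
        if path in self.enabled or self.access_counts[path] < self.threshold:
            return None
        self.enabled.add(path)
        # Standard HDFS CLI for setting a storage policy.
        # The command is only built here, never executed.
        return ["hdfs", "storagepolicies", "-setStoragePolicy",
                "-path", path, "-policy", "LAZY_PERSIST"]
```

On a cluster node, the returned list could be passed to `subprocess.run`. A real mechanism would presumably also weigh recency, file size, and available memory, none of which this sketch models.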