Improving Aerospace Big Data Infrastructure and Applications with Distributed File System and Massive Parallel Calculation
Fan Xu, Bin Yin, Ming-Zhu Zhang, Xue Wang
2022 IEEE 5th International Conference on Computer and Communication Engineering Technology (CCET), 2022-08-19
DOI: 10.1109/CCET55412.2022.9906364
Citations: 0
Abstract
As the aerospace business grows rapidly, data flow and volume have exploded in recent years, bringing both opportunities and challenges to big data infrastructures and applications in this field. In traditional aerospace data and application centers, data is stored in network-attached storage (NAS) and processed by sequential or low-level parallel programs, which can hardly meet the demands for performance, availability, and scalability. In this paper, we present a big data infrastructure based on HDFS for big data centers, which remarkably improves availability and scalability. In addition, we gather a series of typical big data applications in the aerospace field as benchmarks, analyze their characteristics, and accelerate them with the MapReduce framework. The experimental results show that, across all benchmarks, the peak speedup is 4.98 and the average speedup is 3.87.
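For readers unfamiliar with the MapReduce model mentioned in the abstract, the following is a minimal single-process sketch of its three stages (map, shuffle, reduce) in plain Python. It is not the authors' implementation and does not use Hadoop; the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) and the telemetry-style records are hypothetical and chosen only for illustration. In a real deployment the framework would run many map and reduce tasks in parallel over blocks stored in HDFS.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Emit (key, value) pairs for one input record: here, (channel, 1)."""
    channel, _reading = record
    yield channel, 1


def shuffle(pairs):
    """Group emitted values by key, as a framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single result (a count)."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle, and reduce sequentially over an iterable of records."""
    pairs = chain.from_iterable(map_phase(r) for r in records)
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


# Hypothetical input: (channel name, reading) tuples, not data from the paper.
records = [("temp", 21.5), ("pressure", 101.3), ("temp", 22.1), ("temp", 20.9)]
result = run_job(records)
print(result)
```

Because each call to `map_phase` depends only on its own record, and each call to `reduce_phase` depends only on one key's values, both stages can be distributed across machines, which is the property that makes this model suitable for parallel speedup.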