Improving Aerospace Big Data Infrastructure and Applications with Distributed File System and Massive Parallel Calculation
Fan Xu, Bin Yin, Ming-Zhu Zhang, Xue Wang
2022 IEEE 5th International Conference on Computer and Communication Engineering Technology (CCET), published 2022-08-19
DOI: 10.1109/CCET55412.2022.9906364 (https://doi.org/10.1109/CCET55412.2022.9906364)
As the aerospace business grows rapidly, data flow and volume have exploded in recent years, bringing both opportunities and challenges to big data infrastructures and applications in this field. In traditional aerospace data and application centers, data is stored in network-attached storage (NAS) and processed by sequential or low-level parallel programs, which can hardly meet the demands of performance, availability, and scalability. In this paper, we present a big data infrastructure based on HDFS for big data centers, which markedly improves availability and scalability. In addition, we gather a series of typical big data applications in the aerospace field as benchmarks, analyze their characteristics, and accelerate them in the MapReduce framework. The experimental results show that, across all benchmarks, the speedup peaks at 4.98 and averages 3.87.
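To make the MapReduce pattern and the speedup metric mentioned in the abstract concrete, the following is a minimal single-process sketch in Python. It is not the authors' implementation and does not use Hadoop: the telemetry records, the channel names, and the functions `map_phase`, `shuffle`, `reduce_phase`, `run_job`, and `speedup` are all hypothetical, and the sketch assumes speedup is measured in the conventional way, as sequential run time divided by parallel run time.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical telemetry records: (channel name, sample value).
RECORDS = [
    ("temp", 20.0),
    ("temp", 22.0),
    ("volt", 5.0),
    ("temp", 24.0),
    ("volt", 7.0),
]


def map_phase(record):
    """Emit (key, (sum, count)) so partial results combine associatively."""
    channel, value = record
    return channel, (value, 1)


def shuffle(pairs):
    """Group mapped values by key, standing in for a framework's shuffle stage."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(values):
    """Merge the (sum, count) pairs for one key and return the mean."""
    total, count = reduce(lambda a, b: (a[0] + b[0], a[1] + b[1]), values)
    return total / count


def run_job(records):
    """Run map, shuffle, and reduce in sequence within one process."""
    groups = shuffle(map(map_phase, records))
    return {key: reduce_phase(values) for key, values in groups.items()}


def speedup(sequential_seconds, parallel_seconds):
    """Assumed definition: sequential run time divided by parallel run time."""
    return sequential_seconds / parallel_seconds


means = run_job(RECORDS)
print(means)
```

Because each map call depends only on its own record and each reduce call only on the values for its own key, a real framework can distribute both phases across nodes that read blocks from a distributed file system; this sketch keeps everything in one process purely for illustration.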