Authors: Zhikai Zhang, Shushi Gu, Zhang Qinyu, Jiayin Xue
Journal: China Communications (journal article), published 2024-07-01
DOI: 10.23919/JCC.ja.2022-0617 (https://doi.org/10.23919/JCC.ja.2022-0617)
Big data application simulation platform design for onboard distributed processing of LEO mega-constellation networks
Because satellite payloads in LEO mega-constellation networks (LMCNs) are restricted, big data services such as remote sensing image analysis and online learning call for onboard distributed processing (OBDP). With existing technologies, the efficiency of big data applications (BDAs) in distributed systems hinges on stable, low-latency links between worker nodes. LMCNs, with their highly dynamic nodes and long-distance links, cannot provide these conditions, which makes the performance of OBDP hard to measure intuitively. To bridge this gap, a multidimensional simulation platform is needed that can simulate the LMCN network environment and run BDAs inside it for performance testing. Using STK's APIs and a parallel computing framework, we achieve real-time simulation of thousands of satellite nodes, which are mapped to application nodes through software-defined networking (SDN) and container technologies. We describe the architecture and mechanisms of the simulation platform and use Starlink and Hadoop as realistic examples for simulation. The results indicate that the end-to-end latency of LMCNs is dynamic and fluctuates periodically with the constellation's movement. Compared with ground data center networks (GDCNs), LMCNs degrade the throughput of computing and storage jobs; this degradation can be alleviated by using erasure codes and by scheduling the data flows of worker nodes.
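
The periodic latency behaviour reported above can be illustrated with a minimal sketch. It is not the STK-based platform described in the abstract; it is a simplified two-body model that assumes circular orbits, a spherical Earth, and no line-of-sight check. The 550 km altitude, 53° inclination, 5° spacing between orbital planes, and all function names are illustrative assumptions rather than values taken from the paper.

```python
import math

MU = 3.986004418e14    # Earth's gravitational parameter, m^3/s^2
R_EARTH = 6_371_000.0  # mean Earth radius, m
C = 299_792_458.0      # speed of light in vacuum, m/s


def orbital_period(altitude_m):
    """Period of a circular orbit at the given altitude, in seconds."""
    a = R_EARTH + altitude_m
    return 2 * math.pi * math.sqrt(a ** 3 / MU)


def position(altitude_m, inclination_deg, raan_deg, phase_deg, t):
    """Inertial (x, y, z) position in metres at time t (s) on a circular orbit."""
    a = R_EARTH + altitude_m
    inc = math.radians(inclination_deg)
    raan = math.radians(raan_deg)
    # Argument of latitude: initial phase plus uniform angular motion.
    u = math.radians(phase_deg) + 2 * math.pi * t / orbital_period(altitude_m)
    x_orb, y_orb = a * math.cos(u), a * math.sin(u)
    # Rotate from the orbital plane into the inertial frame.
    x = x_orb * math.cos(raan) - y_orb * math.cos(inc) * math.sin(raan)
    y = x_orb * math.sin(raan) + y_orb * math.cos(inc) * math.cos(raan)
    z = y_orb * math.sin(inc)
    return (x, y, z)


def link_delay_ms(sat_a, sat_b, t):
    """One-way propagation delay (ms) of a direct link between two satellites.

    Each satellite is a tuple (altitude_m, inclination_deg, raan_deg, phase_deg).
    Processing, queueing, and routing delays are ignored.
    """
    distance = math.dist(position(*sat_a, t), position(*sat_b, t))
    return distance / C * 1e3


if __name__ == "__main__":
    # Two satellites at the same phase in neighbouring orbital planes (assumed values).
    sat_a = (550e3, 53.0, 0.0, 0.0)
    sat_b = (550e3, 53.0, 5.0, 0.0)
    period = orbital_period(550e3)
    for k in range(9):
        t = k * period / 8
        print(f"t = {t:7.0f} s  delay = {link_delay_ms(sat_a, sat_b, t):.3f} ms")
```

In this model the inter-plane link is longest when both satellites cross the equator and shortest near the highest latitude, so the propagation delay repeats with the orbital motion. A full end-to-end figure would also need multi-hop routing and the link switching that the platform simulates.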