Jianlong Xu, Mengqing Jin, Jinze Xiao, Dianming Lin, Yuelong Liu
{"title":"Multi-round decentralized dataset distillation with federated learning for Low Earth Orbit satellite communication","authors":"Jianlong Xu, Mengqing Jin, Jinze Xiao, Dianming Lin, Yuelong Liu","doi":"10.1016/j.future.2024.107570","DOIUrl":null,"url":null,"abstract":"<div><div>Satellite communication and Low Earth Orbit (LEO) satellites are important components of the 6G network, widely used for Earth observation tasks due to their low cost and short return period, making them a key technology for 6G network connectivity. Due to limitations in satellite system technology and downlink bandwidth, it is not feasible to download all high-resolution image information to ground stations. Even in existing federated learning (FL) methods, sharing well-trained parts of the model can still bottleneck with increasing model size. To address these challenges, we propose a new federated learning framework (FL-M3D) for LEO satellite communication that employs multi-round decentralized dataset distillation techniques. It allows satellites to independently extract local datasets and transmit them to ground stations instead of exchanging model parameters. Communication costs depend only on the size of the synthesized dataset and do not increase with larger models. However, the heterogeneity of satellite datasets can lead to sample ambiguity and decreased model convergence speed. Therefore, we propose distilling the datasets to mitigate the negative effects of data heterogeneity. Through experiments using real-world image datasets, FL-M3D reduces communication volume in simulated satellite networks by approximately 49.84% and achieves improved model performance.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":null,"pages":null},"PeriodicalIF":6.2000,"publicationDate":"2024-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Future Generation Computer Systems-The International Journal of Escience","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0167739X2400534X","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, THEORY & METHODS","Score":null,"Total":0}
引用次数: 0
Abstract
Satellite communication and Low Earth Orbit (LEO) satellites are important components of the 6G network, widely used for Earth observation tasks due to their low cost and short return period, making them a key technology for 6G network connectivity. Due to limitations in satellite system technology and downlink bandwidth, it is not feasible to download all high-resolution image information to ground stations. Even in existing federated learning (FL) methods, sharing well-trained parts of the model can still bottleneck with increasing model size. To address these challenges, we propose a new federated learning framework (FL-M3D) for LEO satellite communication that employs multi-round decentralized dataset distillation techniques. It allows satellites to independently extract local datasets and transmit them to ground stations instead of exchanging model parameters. Communication costs depend only on the size of the synthesized dataset and do not increase with larger models. However, the heterogeneity of satellite datasets can lead to sample ambiguity and decreased model convergence speed. Therefore, we propose distilling the datasets to mitigate the negative effects of data heterogeneity. Through experiments using real-world image datasets, FL-M3D reduces communication volume in simulated satellite networks by approximately 49.84% and achieves improved model performance.
期刊介绍:
Computing infrastructures and systems are constantly evolving, resulting in increasingly complex and collaborative scientific applications. To cope with these advancements, there is a growing need for collaborative tools that can effectively map, control, and execute these applications.
Furthermore, with the explosion of Big Data, there is a requirement for innovative methods and infrastructures to collect, analyze, and derive meaningful insights from the vast amount of data generated. This necessitates the integration of computational and storage capabilities, databases, sensors, and human collaboration.
Future Generation Computer Systems aims to pioneer advancements in distributed systems, collaborative environments, high-performance computing, and Big Data analytics. It strives to stay at the forefront of developments in grids, clouds, and the Internet of Things (IoT) to effectively address the challenges posed by these wide-area, fully distributed sensing and computing systems.