Tao Fan;Hanlin Gu;Xuemei Cao;Chee Seng Chan;Qian Chen;Yiqiang Chen;Yihui Feng;Yang Gu;Jiaxiang Geng;Bing Luo;Shuoling Liu;Win Kent Ong;Chao Ren;Jiaqi Shao;Chuan Sun;Xiaoli Tang;Hong Xi Tae;Yongxin Tong;Shuyue Wei;Fan Wu;Wei Xi;Mingcong Xu;He Yang;Xin Yang;Jiangpeng Yan;Hao Yu;Han Yu;Teng Zhang;Yifei Zhang;Xiaojin Zhang;Zhenzhe Zheng;Lixin Fan;Qiang Yang
{"title":"联邦基础模型中的十个挑战问题","authors":"Tao Fan;Hanlin Gu;Xuemei Cao;Chee Seng Chan;Qian Chen;Yiqiang Chen;Yihui Feng;Yang Gu;Jiaxiang Geng;Bing Luo;Shuoling Liu;Win Kent Ong;Chao Ren;Jiaqi Shao;Chuan Sun;Xiaoli Tang;Hong Xi Tae;Yongxin Tong;Shuyue Wei;Fan Wu;Wei Xi;Mingcong Xu;He Yang;Xin Yang;Jiangpeng Yan;Hao Yu;Han Yu;Teng Zhang;Yifei Zhang;Xiaojin Zhang;Zhenzhe Zheng;Lixin Fan;Qiang Yang","doi":"10.1109/TKDE.2025.3555328","DOIUrl":null,"url":null,"abstract":"Federated Foundation Models (FedFMs) represent a distributed learning paradigm that fuses general competences of foundation models as well as privacy-preserving capabilities of federated learning. This combination allows the large foundation models and the small local domain models at the remote clients to learn from each other in a teacher-student learning setting. This paper provides a comprehensive summary of the ten challenging problems inherent in FedFMs, encompassing foundational theory, utilization of private data, continual learning, unlearning, Non-IID and graph data, bidirectional knowledge transfer, incentive mechanism design, game mechanism design, model watermarking, and efficiency. The ten challenging problems manifest in five pivotal aspects: “Foundational Theory,” which aims to establish a coherent and unifying theoretical framework for FedFMs. “Data,” addressing the difficulties in leveraging domain-specific knowledge from private data while maintaining privacy; “Heterogeneity,” examining variations in data, model, and computational resources across clients; “Security and Privacy,” focusing on defenses against malicious attacks and model theft; and “Efficiency,” highlighting the need for improvements in training, communication, and parameter efficiency. For each problem, we offer a clear mathematical definition on the objective function, analyze existing methods, and discuss the key challenges and potential solutions. This in-depth exploration aims to advance the theoretical foundations of FedFMs, guide practical implementations, and inspire future research to overcome these obstacles, thereby enabling the robust, efficient, and privacy-preserving FedFMs in various real-world applications.","PeriodicalId":13496,"journal":{"name":"IEEE Transactions on Knowledge and Data Engineering","volume":"37 7","pages":"4314-4337"},"PeriodicalIF":10.4000,"publicationDate":"2025-03-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Ten Challenging Problems in Federated Foundation Models\",\"authors\":\"Tao Fan;Hanlin Gu;Xuemei Cao;Chee Seng Chan;Qian Chen;Yiqiang Chen;Yihui Feng;Yang Gu;Jiaxiang Geng;Bing Luo;Shuoling Liu;Win Kent Ong;Chao Ren;Jiaqi Shao;Chuan Sun;Xiaoli Tang;Hong Xi Tae;Yongxin Tong;Shuyue Wei;Fan Wu;Wei Xi;Mingcong Xu;He Yang;Xin Yang;Jiangpeng Yan;Hao Yu;Han Yu;Teng Zhang;Yifei Zhang;Xiaojin Zhang;Zhenzhe Zheng;Lixin Fan;Qiang Yang\",\"doi\":\"10.1109/TKDE.2025.3555328\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Federated Foundation Models (FedFMs) represent a distributed learning paradigm that fuses general competences of foundation models as well as privacy-preserving capabilities of federated learning. This combination allows the large foundation models and the small local domain models at the remote clients to learn from each other in a teacher-student learning setting. 
This paper provides a comprehensive summary of the ten challenging problems inherent in FedFMs, encompassing foundational theory, utilization of private data, continual learning, unlearning, Non-IID and graph data, bidirectional knowledge transfer, incentive mechanism design, game mechanism design, model watermarking, and efficiency. The ten challenging problems manifest in five pivotal aspects: “Foundational Theory,” which aims to establish a coherent and unifying theoretical framework for FedFMs. “Data,” addressing the difficulties in leveraging domain-specific knowledge from private data while maintaining privacy; “Heterogeneity,” examining variations in data, model, and computational resources across clients; “Security and Privacy,” focusing on defenses against malicious attacks and model theft; and “Efficiency,” highlighting the need for improvements in training, communication, and parameter efficiency. For each problem, we offer a clear mathematical definition on the objective function, analyze existing methods, and discuss the key challenges and potential solutions. This in-depth exploration aims to advance the theoretical foundations of FedFMs, guide practical implementations, and inspire future research to overcome these obstacles, thereby enabling the robust, efficient, and privacy-preserving FedFMs in various real-world applications.\",\"PeriodicalId\":13496,\"journal\":{\"name\":\"IEEE Transactions on Knowledge and Data Engineering\",\"volume\":\"37 7\",\"pages\":\"4314-4337\"},\"PeriodicalIF\":10.4000,\"publicationDate\":\"2025-03-28\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Knowledge and Data Engineering\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10944288/\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Knowledge and Data Engineering","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10944288/","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Ten Challenging Problems in Federated Foundation Models
Federated Foundation Models (FedFMs) represent a distributed learning paradigm that fuses the general competences of foundation models with the privacy-preserving capabilities of federated learning. This combination allows large foundation models and small local domain models at remote clients to learn from each other in a teacher-student setting. This paper provides a comprehensive summary of ten challenging problems inherent in FedFMs, encompassing foundational theory, utilization of private data, continual learning, unlearning, Non-IID and graph data, bidirectional knowledge transfer, incentive mechanism design, game mechanism design, model watermarking, and efficiency. The ten challenging problems manifest in five pivotal aspects: “Foundational Theory,” which aims to establish a coherent and unifying theoretical framework for FedFMs; “Data,” addressing the difficulties in leveraging domain-specific knowledge from private data while maintaining privacy; “Heterogeneity,” examining variations in data, models, and computational resources across clients; “Security and Privacy,” focusing on defenses against malicious attacks and model theft; and “Efficiency,” highlighting the need for improvements in training, communication, and parameter efficiency. For each problem, we offer a clear mathematical definition of the objective function, analyze existing methods, and discuss the key challenges and potential solutions. This in-depth exploration aims to advance the theoretical foundations of FedFMs, guide practical implementations, and inspire future research to overcome these obstacles, thereby enabling robust, efficient, and privacy-preserving FedFMs in various real-world applications.
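The abstract refers to a teacher-student learning setting between a server-side foundation model and small client-side domain models, with a mathematical objective defined for each problem. As an illustrative sketch only (the notation below is our assumption, not the formulation given in the paper), such an objective is often written as a weighted sum of each client's local task loss plus a distillation term aligning the two models:

\[
\min_{\theta,\,\{\phi_k\}}\;\sum_{k=1}^{K} \frac{n_k}{n}\,
\mathbb{E}_{(x,y)\sim \mathcal{D}_k}\!\left[
\ell\big(f_{\phi_k}(x),\,y\big)
\;+\;
\lambda\,\mathrm{KL}\!\big(f_{\theta}(x)\,\|\,f_{\phi_k}(x)\big)
\right],
\]

where \(\theta\) denotes the shared foundation model, \(\phi_k\) the local model of client \(k\) holding \(n_k\) private samples (\(n=\sum_k n_k\)), \(\ell\) the local task loss, and \(\lambda\) a weight on the knowledge-transfer term. Swapping the roles of \(\theta\) and \(\phi_k\) in the KL term gives the reverse (student-to-teacher) direction of the bidirectional knowledge transfer mentioned above.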
Journal Introduction:
The IEEE Transactions on Knowledge and Data Engineering encompasses knowledge and data engineering aspects within computer science, artificial intelligence, electrical engineering, computer engineering, and related fields. It provides an interdisciplinary platform for disseminating new developments in knowledge and data engineering and explores the practicality of these concepts in both hardware and software. Specific areas covered include knowledge-based and expert systems, AI techniques for knowledge and data management, tools, and methodologies, distributed processing, real-time systems, architectures, data management practices, database design, query languages, security, fault tolerance, statistical databases, algorithms, performance evaluation, and applications.