Hongquan Liu, Yuxi Mi, Yateng Tang, Jihong Guan, Shuigeng Zhou
Neural Networks, Volume 188, Article 107440. Journal article, published 2025-04-04. DOI: 10.1016/j.neunet.2025.107440
Impact factor: 6.0. JCR: Q1 (Computer Science, Artificial Intelligence).
https://www.sciencedirect.com/science/article/pii/S0893608025003193
Boosting semi-supervised federated learning by effectively exploiting server-side knowledge and client-side unconfident samples
Semi-supervised federated learning (SSFL) has emerged as a promising paradigm for reducing the need for fully labeled data when training federated learning (FL) models. This paper focuses on the label-at-server scenario, where clients' data are entirely unlabeled and the server holds only a limited amount of labeled data. In this setting, non-independent and identically distributed (non-IID) local data and incorrect pseudo-labels can introduce bias into the model during local training. Prior works try to alleviate this bias by fine-tuning the global model with clean labeled data, but they do not explicitly leverage server-side knowledge to guide local training. Additionally, existing methods typically discard samples with unconfident pseudo-labels, leaving many samples unused and leading to suboptimal performance and slow convergence. This paper introduces a novel method that enhances SSFL performance by effectively exploiting server-side clean knowledge and client-side unconfident samples. Specifically, we propose a representation alignment module that mitigates the influence of non-IID data by aligning local features with the class proxies of the server's labeled data. Furthermore, we employ a shrink loss to reduce the risk associated with unreliable pseudo-labels, so that the valuable information contained in the entire unlabeled dataset can be exploited. Extensive experiments on five benchmark datasets under various settings demonstrate the effectiveness and generality of the proposed method, which not only outperforms existing methods but also reduces the communication cost required to reach the target performance.
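The abstract does not give formulas, so the following is only an illustrative sketch of two ideas it mentions, under stated assumptions and not the authors' implementation. It assumes a class proxy is the mean feature vector of the server's labeled samples for that class, that alignment is measured by squared Euclidean distance, and that baseline methods discard pseudo-labels below a confidence threshold (0.95 here is a hypothetical value). All function names are hypothetical. The shrink loss is not sketched because the abstract does not define it.

```python
import math


def class_proxies(features, labels, num_classes):
    """Assumed proxy definition: per-class mean of server-side labeled features."""
    dim = len(features[0])
    sums = [[0.0] * dim for _ in range(num_classes)]
    counts = [0] * num_classes
    for feat, y in zip(features, labels):
        counts[y] += 1
        for j, value in enumerate(feat):
            sums[y][j] += value
    # Classes with no labeled sample keep an all-zero proxy.
    return [[s / c for s in row] if c else row for row, c in zip(sums, counts)]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pseudo_label(logits, threshold=0.95):
    """Return (predicted class, confidence, is_confident) for one unlabeled sample."""
    probs = softmax(logits)
    conf = max(probs)
    return probs.index(conf), conf, conf >= threshold


def alignment_loss(feature, proxy):
    """Assumed alignment term: squared Euclidean distance to the class proxy."""
    return sum((a - b) ** 2 for a, b in zip(feature, proxy))


# Server side: three labeled feature vectors from two classes.
proxies = class_proxies([[1.0, 0.0], [3.0, 0.0], [0.0, 2.0]], [0, 0, 1], 2)

# Client side: one sample with a confident prediction, one without.
confident = pseudo_label([5.0, 0.0])
unconfident = pseudo_label([2.0, 0.0])

# A threshold-based baseline would drop the unconfident sample entirely;
# the alignment term can still be computed for it against its predicted class.
loss = alignment_loss([1.0, 0.0], proxies[unconfident[0]])
```

In this toy run the second client sample falls below the threshold, which is the kind of sample the abstract says existing methods discard and the proposed method tries to keep using.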
About the journal:
Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. Our journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, we aim to encourage the development of biologically-inspired artificial intelligence.