Boosting semi-supervised federated learning by effectively exploiting server-side knowledge and client-side unconfident samples

IF 6.0 | CAS Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Hongquan Liu, Yuxi Mi, Yateng Tang, Jihong Guan, Shuigeng Zhou
{"title":"通过有效利用服务器端知识和客户端无把握样本,提升半监督联合学习能力","authors":"Hongquan Liu ,&nbsp;Yuxi Mi ,&nbsp;Yateng Tang ,&nbsp;Jihong Guan ,&nbsp;Shuigeng Zhou","doi":"10.1016/j.neunet.2025.107440","DOIUrl":null,"url":null,"abstract":"<div><div>Semi-supervised federated learning (SSFL) has emerged as a promising paradigm to reduce the need for fully labeled data in training federated learning (FL) models. This paper focuses on the label-at-server scenario, where clients’ data are entirely unlabeled and the server possesses only a limited amount of labeled data. In this setting, the non-independent and identically distributed (non-IID) local data and the incorrect pseudo-labels will possibly introduce bias into the model during local training. Prior works try to alleviate the bias by fine-tuning the global model with clean labeled data, ignoring explicitly leveraging server-side knowledge to guide local training. Additionally, existing methods typically discard samples with unconfident pseudo-labels, resulting in many samples being not used, consequently suboptimal performance and slow convergence. This paper introduces a novel method to enhance SSFL performance by effectively exploiting server-side clean knowledge and client-side unconfident samples. Specifically, we propose a representation alignment module that mitigates the influence of non-IID data by aligning local features with the <em>class proxies</em> of the server labeled data. Furthermore, we employ a shrink loss to reduce the risk associated with unreliable pseudo-labels, ensuring the exploitation of valuable information contained in the entire unlabeled dataset. Extensive experiments on five benchmark datasets under various settings demonstrate the effectiveness and generality of the proposed method, which not only outperforms existing methods but also reduces the communication cost required to achieve the target performance.</div></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"188 ","pages":"Article 107440"},"PeriodicalIF":6.0000,"publicationDate":"2025-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Boosting semi-supervised federated learning by effectively exploiting server-side knowledge and client-side unconfident samples\",\"authors\":\"Hongquan Liu ,&nbsp;Yuxi Mi ,&nbsp;Yateng Tang ,&nbsp;Jihong Guan ,&nbsp;Shuigeng Zhou\",\"doi\":\"10.1016/j.neunet.2025.107440\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Semi-supervised federated learning (SSFL) has emerged as a promising paradigm to reduce the need for fully labeled data in training federated learning (FL) models. This paper focuses on the label-at-server scenario, where clients’ data are entirely unlabeled and the server possesses only a limited amount of labeled data. In this setting, the non-independent and identically distributed (non-IID) local data and the incorrect pseudo-labels will possibly introduce bias into the model during local training. Prior works try to alleviate the bias by fine-tuning the global model with clean labeled data, ignoring explicitly leveraging server-side knowledge to guide local training. Additionally, existing methods typically discard samples with unconfident pseudo-labels, resulting in many samples being not used, consequently suboptimal performance and slow convergence. 
This paper introduces a novel method to enhance SSFL performance by effectively exploiting server-side clean knowledge and client-side unconfident samples. Specifically, we propose a representation alignment module that mitigates the influence of non-IID data by aligning local features with the <em>class proxies</em> of the server labeled data. Furthermore, we employ a shrink loss to reduce the risk associated with unreliable pseudo-labels, ensuring the exploitation of valuable information contained in the entire unlabeled dataset. Extensive experiments on five benchmark datasets under various settings demonstrate the effectiveness and generality of the proposed method, which not only outperforms existing methods but also reduces the communication cost required to achieve the target performance.</div></div>\",\"PeriodicalId\":49763,\"journal\":{\"name\":\"Neural Networks\",\"volume\":\"188 \",\"pages\":\"Article 107440\"},\"PeriodicalIF\":6.0000,\"publicationDate\":\"2025-04-04\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Neural Networks\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0893608025003193\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neural Networks","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0893608025003193","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Abstract

Semi-supervised federated learning (SSFL) has emerged as a promising paradigm for reducing the need for fully labeled data when training federated learning (FL) models. This paper focuses on the label-at-server scenario, where clients' data are entirely unlabeled and the server possesses only a limited amount of labeled data. In this setting, non-independent and identically distributed (non-IID) local data and incorrect pseudo-labels may introduce bias into the model during local training. Prior works try to alleviate this bias by fine-tuning the global model with clean labeled data, but neglect to explicitly leverage server-side knowledge to guide local training. Additionally, existing methods typically discard samples with unconfident pseudo-labels, so that many samples go unused, which leads to suboptimal performance and slow convergence. This paper introduces a novel method to enhance SSFL performance by effectively exploiting server-side clean knowledge and client-side unconfident samples. Specifically, we propose a representation alignment module that mitigates the influence of non-IID data by aligning local features with the class proxies of the server's labeled data. Furthermore, we employ a shrink loss to reduce the risk associated with unreliable pseudo-labels, ensuring that the valuable information contained in the entire unlabeled dataset is exploited. Extensive experiments on five benchmark datasets under various settings demonstrate the effectiveness and generality of the proposed method, which not only outperforms existing methods but also reduces the communication cost required to achieve the target performance.
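The page carries no implementation details beyond the abstract, so the following is a minimal PyTorch-style sketch of the two mechanisms the abstract names: aligning local features with server-side class proxies, and a shrink loss that down-weights unconfident pseudo-labels instead of discarding them. The function names, the cosine-distance alignment, and the confidence-weighting scheme are illustrative assumptions, not the authors' exact formulation.

    import torch
    import torch.nn.functional as F

    def proxy_align_loss(local_feats, pseudo_labels, class_proxies):
        # Assumed sketch: pull each local feature toward the server-side class
        # proxy (e.g., the mean feature of the server's labeled samples) of its
        # pseudo-label, to counteract non-IID drift during local training.
        # local_feats: (B, D), pseudo_labels: (B,), class_proxies: (C, D)
        feats = F.normalize(local_feats, dim=1)
        proxies = F.normalize(class_proxies, dim=1)
        targets = proxies[pseudo_labels]                      # (B, D)
        return (1.0 - (feats * targets).sum(dim=1)).mean()    # cosine distance

    def shrink_loss(logits, pseudo_labels, confidences):
        # Assumed "shrink" behavior: keep every unlabeled sample but scale its
        # cross-entropy by the pseudo-label confidence, shrinking the gradient
        # contribution of unreliable samples rather than dropping them at a
        # hard confidence threshold.
        ce = F.cross_entropy(logits, pseudo_labels, reduction="none")  # (B,)
        return (confidences * ce).mean()

    # Usage sketch for one client-side training step (lam is a tuning weight;
    # encoder and server_proxies are hypothetical names):
    # logits = model(x_unlabeled)
    # conf, pl = F.softmax(logits, dim=1).max(dim=1)
    # loss = shrink_loss(logits, pl, conf.detach()) \
    #        + lam * proxy_align_loss(encoder(x_unlabeled), pl, server_proxies)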
Source journal
Neural Networks (Engineering & Technology – Computer Science: Artificial Intelligence)
CiteScore: 13.90
Self-citation rate: 7.70%
Articles published: 425
Review time: 67 days
About the journal: Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. Our journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, we aim to encourage the development of biologically-inspired artificial intelligence.