Federated semi-supervised learning via globally guided pseudo-labeling: A robust approach for label-scarce scenarios
Authors: Yuan Xi, Qiong Li, Haokun Mao
Journal: Expert Systems with Applications, volume 294, Article 128667 (published 2025-06-25)
Journal metrics: impact factor 7.5; JCR Q1 (Computer Science, Artificial Intelligence)
DOI: 10.1016/j.eswa.2025.128667
URL: https://www.sciencedirect.com/science/article/pii/S0957417425022857
Federated Semi-Supervised Learning (FSSL) is a powerful paradigm for collaboratively training models on both labeled and unlabeled data, and it has been adopted in domains such as healthcare and IoT. However, heterogeneous data distributions and imbalanced labeling capabilities both lead to significant prediction bias across participating clients, which in turn produces skewed pseudo-labels during local training. Most existing FSSL studies address this bias by improving model consistency, an approach that relies on a well-trained benchmark derived from a fully labeled client and therefore struggles in label-scarce scenarios. In this paper, we propose a novel FSSL method, Federated Globally Guided pseudo-labeling (FedGGp), suitable for both label-scarce and Non-Independent and Identically Distributed (Non-IID) scenarios. Specifically, the method summarizes prediction bias assessments based on skewed class predictions and adjusts the pseudo-labeling indicators accordingly in the subsequent iteration. For advantaged classes, FedGGp employs adaptive thresholds to generate high-quality pseudo-labels, while for disadvantaged classes it expands the number of pseudo-labels to ensure balanced model training. Moreover, soft consistency regularization is applied to broaden the pseudo-label boundary for some underrepresented classes, which are typically ambiguous during classification. Experimental results on four datasets show that FedGGp outperforms various state-of-the-art methods.
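The abstract does not give FedGGp's actual formulas, so the following is only a rough sketch of the general idea of class-wise thresholds steered by globally aggregated prediction counts. The linear scaling rule, the function names `class_thresholds` and `pseudo_label`, and the parameters `base_tau` and `min_tau` are all illustrative assumptions, not the authors' method. Classes that the global model predicts often keep a strict confidence threshold, while rarely predicted classes get a looser one so that more of their samples receive pseudo-labels.

```python
import numpy as np


def class_thresholds(global_pred_counts, base_tau=0.95, min_tau=0.5):
    """Map per-class prediction counts to per-class confidence thresholds.

    Assumed rule (not from the paper): the most frequently predicted class
    keeps base_tau; other classes are scaled linearly down toward min_tau.
    """
    counts = np.asarray(global_pred_counts, dtype=float)
    peak = counts.max()
    # With no predictions at all, fall back to the strict threshold everywhere.
    ratio = counts / peak if peak > 0 else np.ones_like(counts)
    return min_tau + (base_tau - min_tau) * ratio


def pseudo_label(probs, thresholds):
    """Return (labels, mask): the argmax class per sample and whether its
    confidence reaches the threshold of that predicted class."""
    probs = np.asarray(probs, dtype=float)
    labels = probs.argmax(axis=1)
    confidence = probs.max(axis=1)
    mask = confidence >= thresholds[labels]
    return labels, mask


# Hypothetical aggregated counts: class 0 is over-predicted, class 2 is rare.
tau = class_thresholds([80, 15, 5])
probs = np.array([
    [0.90, 0.05, 0.05],  # class 0, below its strict threshold
    [0.20, 0.70, 0.10],  # class 1, accepted under a looser threshold
    [0.30, 0.30, 0.40],  # class 2, still too uncertain
    [0.97, 0.02, 0.01],  # class 0, confident enough
])
labels, mask = pseudo_label(probs, tau)
print(tau, labels, mask)
```

In a federated setting the counts would presumably be summed across clients on the server and sent back for the next round, which is one way to read "modifies pseudo-labeling indicators accordingly in the subsequent iteration"; that reading is itself an assumption.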
About the journal:
Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.