Improving self-supervised vertical federated learning with contrastive instance-wise similarity and dynamical balance pool

Impact Factor 6.2 · CAS Tier 2 (Computer Science) · JCR Q1, COMPUTER SCIENCE, THEORY & METHODS
Shuai Chen, Wenyu Zhang, Xiaoling Huang, Cheng Zhang, Qingjun Mao
{"title":"利用对比实例相似性和动态平衡池改进自监督垂直联邦学习","authors":"Shuai Chen ,&nbsp;Wenyu Zhang ,&nbsp;Xiaoling Huang ,&nbsp;Cheng Zhang ,&nbsp;Qingjun Mao","doi":"10.1016/j.future.2025.107884","DOIUrl":null,"url":null,"abstract":"<div><div>Vertical Federated Learning (VFL) enables multiple parties with distinct feature spaces to train a joint VFL model collaboratively without exposing their original private data. In realistic scenarios, the scarcity of aligned and labeled samples among collaborating participants limits the effectiveness of traditional VFL approaches for model training. Current VFL frameworks attempt to leverage abundant unlabeled data using Contrastive Self-Supervised Learning (CSSL). However, the simplistic incorporation of CSSL methods cannot address severe domain shift in VFL. In addition, CSSL methods typically conflict with general regularization approaches designed to alleviate domain shift, thereby significantly limiting the potential of the self-supervised learning framework in VFL. To address these challenges, this study proposes an Improved Self-Supervised Vertical Federated Learning (ISSVFL) framework for VFL in label-scarce scenarios under the semi-honest and no-collusion assumption. ISSVFL merges CSSL with instance-wise similarity to resolve regularization conflicts and captures more significant inter-domain knowledge in the representations from different participants, effectively alleviating domain shift. In addition, a new dynamical balance pool is proposed to fine-tune the pre-trained models for downstream supervised tasks by dynamically balancing inter-domain and intra-domain knowledge. Extensive empirical experiments on image and tabular datasets demonstrate that ISSVFL achieves an average performance improvement of 3.3 % compared with state-of-the-art baselines.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"172 ","pages":"Article 107884"},"PeriodicalIF":6.2000,"publicationDate":"2025-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Improving self-supervised vertical federated learning with contrastive instance-wise similarity and dynamical balance pool\",\"authors\":\"Shuai Chen ,&nbsp;Wenyu Zhang ,&nbsp;Xiaoling Huang ,&nbsp;Cheng Zhang ,&nbsp;Qingjun Mao\",\"doi\":\"10.1016/j.future.2025.107884\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Vertical Federated Learning (VFL) enables multiple parties with distinct feature spaces to train a joint VFL model collaboratively without exposing their original private data. In realistic scenarios, the scarcity of aligned and labeled samples among collaborating participants limits the effectiveness of traditional VFL approaches for model training. Current VFL frameworks attempt to leverage abundant unlabeled data using Contrastive Self-Supervised Learning (CSSL). However, the simplistic incorporation of CSSL methods cannot address severe domain shift in VFL. In addition, CSSL methods typically conflict with general regularization approaches designed to alleviate domain shift, thereby significantly limiting the potential of the self-supervised learning framework in VFL. To address these challenges, this study proposes an Improved Self-Supervised Vertical Federated Learning (ISSVFL) framework for VFL in label-scarce scenarios under the semi-honest and no-collusion assumption. 
ISSVFL merges CSSL with instance-wise similarity to resolve regularization conflicts and captures more significant inter-domain knowledge in the representations from different participants, effectively alleviating domain shift. In addition, a new dynamical balance pool is proposed to fine-tune the pre-trained models for downstream supervised tasks by dynamically balancing inter-domain and intra-domain knowledge. Extensive empirical experiments on image and tabular datasets demonstrate that ISSVFL achieves an average performance improvement of 3.3 % compared with state-of-the-art baselines.</div></div>\",\"PeriodicalId\":55132,\"journal\":{\"name\":\"Future Generation Computer Systems-The International Journal of Escience\",\"volume\":\"172 \",\"pages\":\"Article 107884\"},\"PeriodicalIF\":6.2000,\"publicationDate\":\"2025-05-02\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Future Generation Computer Systems-The International Journal of Escience\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0167739X25001797\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, THEORY & METHODS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Future Generation Computer Systems-The International Journal of Escience","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0167739X25001797","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, THEORY & METHODS","Score":null,"Total":0}
Citations: 0

Abstract

Vertical Federated Learning (VFL) enables multiple parties with distinct feature spaces to train a joint VFL model collaboratively without exposing their original private data. In realistic scenarios, the scarcity of aligned and labeled samples among collaborating participants limits the effectiveness of traditional VFL approaches for model training. Current VFL frameworks attempt to leverage abundant unlabeled data using Contrastive Self-Supervised Learning (CSSL). However, the simplistic incorporation of CSSL methods cannot address severe domain shift in VFL. In addition, CSSL methods typically conflict with general regularization approaches designed to alleviate domain shift, thereby significantly limiting the potential of the self-supervised learning framework in VFL. To address these challenges, this study proposes an Improved Self-Supervised Vertical Federated Learning (ISSVFL) framework for VFL in label-scarce scenarios under the semi-honest and no-collusion assumption. ISSVFL merges CSSL with instance-wise similarity to resolve regularization conflicts and captures more significant inter-domain knowledge in the representations from different participants, effectively alleviating domain shift. In addition, a new dynamical balance pool is proposed to fine-tune the pre-trained models for downstream supervised tasks by dynamically balancing inter-domain and intra-domain knowledge. Extensive empirical experiments on image and tabular datasets demonstrate that ISSVFL achieves an average performance improvement of 3.3 % compared with state-of-the-art baselines.
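The abstract does not include an implementation, but the central idea it describes, pairing a cross-party contrastive objective with an instance-wise similarity term over the participants' representations, can be illustrated with a minimal sketch. Everything below (the tensor names, the temperature, the weighting coefficient `lambda_sim`, and the choice of an MSE penalty on similarity matrices) is an illustrative assumption, not the authors' actual formulation.

```python
# Minimal, self-contained sketch (not the paper's code): a cross-party
# contrastive loss combined with an instance-wise similarity term.
# The temperature, lambda_sim, and the MSE form of the similarity term
# are illustrative assumptions.
import torch
import torch.nn.functional as F


def cross_party_contrastive_loss(z_a: torch.Tensor, z_b: torch.Tensor,
                                 temperature: float = 0.1) -> torch.Tensor:
    """InfoNCE-style loss: the two parties' embeddings of the same aligned
    sample form a positive pair; other samples in the batch are negatives."""
    z_a = F.normalize(z_a, dim=1)
    z_b = F.normalize(z_b, dim=1)
    logits = z_a @ z_b.t() / temperature        # (N, N) cross-party similarities
    targets = torch.arange(z_a.size(0), device=z_a.device)
    # Symmetrize over the A->B and B->A directions.
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.t(), targets))


def instance_similarity_loss(z_a: torch.Tensor, z_b: torch.Tensor) -> torch.Tensor:
    """Instance-wise similarity term: both parties should induce the same
    pairwise similarity structure over the batch."""
    sim_a = F.normalize(z_a, dim=1) @ F.normalize(z_a, dim=1).t()
    sim_b = F.normalize(z_b, dim=1) @ F.normalize(z_b, dim=1).t()
    return F.mse_loss(sim_a, sim_b)


def pretraining_loss(z_a, z_b, lambda_sim: float = 1.0) -> torch.Tensor:
    return (cross_party_contrastive_loss(z_a, z_b)
            + lambda_sim * instance_similarity_loss(z_a, z_b))


# Toy usage: 32 aligned samples, each party's bottom model outputs a
# 128-dimensional embedding of its own feature slice.
z_party_a = torch.randn(32, 128, requires_grad=True)
z_party_b = torch.randn(32, 128, requires_grad=True)
loss = pretraining_loss(z_party_a, z_party_b)
loss.backward()
print(float(loss))
```

In an actual VFL deployment each party would compute its embeddings locally with its own bottom model and exchange only those embeddings (or gradients on them) with the coordinating party; this single-process sketch does not model that communication, nor the dynamical balance pool used during fine-tuning.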
Source journal
CiteScore: 19.90
Self-citation rate: 2.70%
Articles published: 376
Review time: 10.6 months
Journal description: Computing infrastructures and systems are constantly evolving, resulting in increasingly complex and collaborative scientific applications. To cope with these advancements, there is a growing need for collaborative tools that can effectively map, control, and execute these applications. Furthermore, with the explosion of Big Data, there is a requirement for innovative methods and infrastructures to collect, analyze, and derive meaningful insights from the vast amount of data generated. This necessitates the integration of computational and storage capabilities, databases, sensors, and human collaboration. Future Generation Computer Systems aims to pioneer advancements in distributed systems, collaborative environments, high-performance computing, and Big Data analytics. It strives to stay at the forefront of developments in grids, clouds, and the Internet of Things (IoT) to effectively address the challenges posed by these wide-area, fully distributed sensing and computing systems.