Embracing knowledge integration from the vision-language model for federated domain generalization on multi-source fused data

Impact Factor 15.5 · CAS Tier 1 (Computer Science) · JCR Q1 (Computer Science, Artificial Intelligence)
Zhenyu Liu, Heye Zhang, Yiwen Wang, Zhifan Gao
{"title":"Embracing knowledge integration from the vision-language model for federated domain generalization on multi-source fused data","authors":"Zhenyu Liu ,&nbsp;Heye Zhang ,&nbsp;Yiwen Wang ,&nbsp;Zhifan Gao","doi":"10.1016/j.inffus.2025.103714","DOIUrl":null,"url":null,"abstract":"<div><div>Federated Domain Generalization (FedDG) has attracted attention for its potential to enable privacy-preserving fusion of multi-source data. It aims to develop a global model in a distributed manner that generalizes to unseen clients. However, it faces the challenge of the tradeoff between inter-client and intra-client domain shifts. Knowledge distillation from the vision-language model may address this challenge by transferring its zero-shot generalization ability to client models. However, it may suffer from distribution discrepancies between the pretraining data of the vision-language model and the downstream data. Although pre-distillation fine-tuning may alleviate this issue in centralized settings, it may not be compatible with FedDG. In this paper, we introduce an in-distillation selective adaptation framework for FedDG. It selectively fine-tunes unreliable outputs while directly distilling reliable ones from the vision-language model, effectively using knowledge distillation to address the challenge in FedDG. Furthermore, we propose a federated energy-driven reliability appraisal (FedReap) method to support this framework by appraising the reliability of outputs from the vision-language model. It includes hypersphere-constraint energy construction and label-guided energy partition. These two processes enable FedReap to acquire reliable and unreliable outputs for direct distillation and adaptation. In addition, FedReap employs a dual-level distillation strategy and a dual-stage adaptation strategy for distillation and adaptation. Extensive experiments on five datasets demonstrate the effectiveness of FedReap compared to twelve state-of-the-art methods.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"127 ","pages":"Article 103714"},"PeriodicalIF":15.5000,"publicationDate":"2025-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Information Fusion","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1566253525007717","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Abstract

Federated Domain Generalization (FedDG) has attracted attention for its potential to enable privacy-preserving fusion of multi-source data. It aims to develop, in a distributed manner, a global model that generalizes to unseen clients. However, it faces the challenge of trading off inter-client and intra-client domain shifts. Knowledge distillation from a vision-language model may address this challenge by transferring the model's zero-shot generalization ability to client models. However, it may suffer from distribution discrepancies between the pretraining data of the vision-language model and the downstream data. Although pre-distillation fine-tuning may alleviate this issue in centralized settings, it may not be compatible with FedDG. In this paper, we introduce an in-distillation selective adaptation framework for FedDG. It selectively fine-tunes unreliable outputs of the vision-language model while directly distilling reliable ones, effectively using knowledge distillation to address the challenge in FedDG. Furthermore, we propose a federated energy-driven reliability appraisal (FedReap) method to support this framework by appraising the reliability of outputs from the vision-language model. It comprises two processes, hypersphere-constrained energy construction and label-guided energy partition, which enable FedReap to separate reliable outputs for direct distillation from unreliable outputs for adaptation. In addition, FedReap employs a dual-level distillation strategy for distillation and a dual-stage adaptation strategy for adaptation. Extensive experiments on five datasets demonstrate the effectiveness of FedReap compared to twelve state-of-the-art methods.
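The abstract gives no implementation details, but the core mechanism it describes (scoring the vision-language teacher's outputs with an energy function, distilling the reliable ones directly, and routing the unreliable ones to an adaptation branch) can be illustrated with a short sketch. The PyTorch snippet below is our own minimal illustration, not the authors' code: it uses the standard free-energy score E(x) = -T · logsumexp(f(x)/T) over the teacher's logits with a hypothetical threshold `tau` to partition a batch, and omits FedReap's hypersphere constraint, label guidance, dual-level distillation, and federated aggregation. All function names and parameters here are assumptions for illustration only.

```python
import torch
import torch.nn.functional as F

def energy_score(logits: torch.Tensor, T: float = 1.0) -> torch.Tensor:
    """Free-energy score E(x) = -T * logsumexp(f(x) / T).

    Lower energy indicates a more confident (in-distribution) output,
    following the standard energy-based OOD scoring convention.
    """
    return -T * torch.logsumexp(logits / T, dim=-1)

def selective_distillation_loss(student_logits: torch.Tensor,
                                vlm_logits: torch.Tensor,
                                tau: float,
                                T: float = 2.0):
    """Distill only teacher outputs whose energy falls below `tau`.

    Returns the distillation loss over the reliable subset and a boolean
    mask of unreliable samples, which a separate adaptation stage would
    fine-tune instead of distilling directly.
    """
    # Low energy -> treat the teacher output as reliable.
    reliable = energy_score(vlm_logits) <= tau
    loss = student_logits.new_zeros(())
    if reliable.any():
        # Soft-label KL distillation on the reliable subset,
        # scaled by T^2 as in standard knowledge distillation.
        p_teacher = F.softmax(vlm_logits[reliable] / T, dim=-1)
        log_p_student = F.log_softmax(student_logits[reliable] / T, dim=-1)
        loss = F.kl_div(log_p_student, p_teacher, reduction="batchmean") * T * T
    return loss, ~reliable

# Usage with random tensors standing in for a batch of 8 samples, 10 classes.
# The threshold tau is dataset-dependent and purely illustrative here.
student_logits = torch.randn(8, 10)
vlm_logits = torch.randn(8, 10)
loss, unreliable_mask = selective_distillation_loss(student_logits, vlm_logits, tau=-2.0)
```

In the paper's framework, the split is produced by the hypersphere-constrained energy construction and label-guided energy partition rather than a fixed scalar threshold; this sketch only shows the selective-distillation pattern that such a partition enables.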
Source Journal
Information Fusion (Engineering & Technology - Computer Science: Theory & Methods)
CiteScore: 33.20
Self-citation rate: 4.30%
Annual articles: 161
Review time: 7.9 months
Journal introduction: Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among the diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers dealing with fundamental theoretical analyses, as well as those demonstrating their application to real-world problems, are welcome.