Latest Articles in Information Fusion

A survey on RGB, 3D, and multimodal approaches for unsupervised industrial image anomaly detection
IF 14.7 · CAS Q1 · Computer Science
Information Fusion Pub Date : 2025-04-03 DOI: 10.1016/j.inffus.2025.103139
Yuxuan Lin , Yang Chang , Xuan Tong , Jiawen Yu , Antonio Liotta , Guofan Huang , Wei Song , Deyu Zeng , Zongze Wu , Yan Wang , Wenqiang Zhang
{"title":"A survey on RGB, 3D, and multimodal approaches for unsupervised industrial image anomaly detection","authors":"Yuxuan Lin ,&nbsp;Yang Chang ,&nbsp;Xuan Tong ,&nbsp;Jiawen Yu ,&nbsp;Antonio Liotta ,&nbsp;Guofan Huang ,&nbsp;Wei Song ,&nbsp;Deyu Zeng ,&nbsp;Zongze Wu ,&nbsp;Yan Wang ,&nbsp;Wenqiang Zhang","doi":"10.1016/j.inffus.2025.103139","DOIUrl":"10.1016/j.inffus.2025.103139","url":null,"abstract":"<div><div>In the advancement of industrial informatization, unsupervised anomaly detection technology effectively overcomes the scarcity of abnormal samples and significantly enhances the automation and reliability of smart manufacturing. As an important branch, industrial image anomaly detection focuses on automatically identifying visual anomalies in industrial scenarios (such as product surface defects, assembly errors, and equipment appearance anomalies) through computer vision techniques. With the rapid development of Unsupervised industrial Image Anomaly Detection (UIAD), excellent detection performance has been achieved not only in RGB setting but also in 3D and multimodal (RGB and 3D) settings. However, existing surveys primarily focus on UIAD tasks in RGB setting, with little discussion in 3D and multimodal settings. To address this gap, this article provides a comprehensive review of UIAD tasks in the three modal settings. Specifically, we first introduce the task concept and process of UIAD. We then overview the research on UIAD in three modal settings (RGB, 3D, and multimodal), including datasets and methods, and review multimodal feature fusion strategies in multimodal setting. Finally, we summarize the main challenges faced by UIAD tasks in the three modal settings, and offer insights into future development directions, aiming to provide researchers with a comprehensive reference and offer new perspectives for the advancement of industrial informatization. Corresponding resources are available at <span><span>https://github.com/Sunny5250/Awesome-Multi-Setting-UIAD</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"121 ","pages":"Article 103139"},"PeriodicalIF":14.7,"publicationDate":"2025-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143807214","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Semantic-Preserving Feature Partitioning for multi-view ensemble learning
IF 14.7 · CAS Q1 · Computer Science
Information Fusion Pub Date : 2025-04-03 DOI: 10.1016/j.inffus.2025.103152
Mohammad Sadegh Khorshidi , Navid Yazdanjue , Hassan Gharoun , Danial Yazdani , Mohammad Reza Nikoo , Fang Chen , Amir H. Gandomi
{"title":"Semantic-Preserving Feature Partitioning for multi-view ensemble learning","authors":"Mohammad Sadegh Khorshidi ,&nbsp;Navid Yazdanjue ,&nbsp;Hassan Gharoun ,&nbsp;Danial Yazdani ,&nbsp;Mohammad Reza Nikoo ,&nbsp;Fang Chen ,&nbsp;Amir H. Gandomi","doi":"10.1016/j.inffus.2025.103152","DOIUrl":"10.1016/j.inffus.2025.103152","url":null,"abstract":"<div><div>In machine learning, the exponential growth of data and the associated “curse of dimensionality” pose significant challenges, particularly with expansive yet sparse datasets. Addressing these challenges, multi-view ensemble learning (MEL) has emerged as a transformative approach, with feature partitioning (FP) playing a pivotal role in constructing artificial views for MEL. Our study introduces the Semantic-Preserving Feature Partitioning (SPFP) algorithm, a novel method grounded in information theory. The SPFP algorithm partitions datasets into multiple semantically consistent views, enhancing the MEL process. Through extensive experiments on eight real-world datasets, ranging from high-dimensional with limited instances to low-dimensional with high instances, our method demonstrates notable efficacy. It maintains model accuracy while significantly improving uncertainty measures in scenarios where high generalization performance is achievable. Conversely, it retains uncertainty metrics while enhancing accuracy where high generalization accuracy is less attainable. An effect size analysis further reveals that the SPFP algorithm outperforms benchmark models by large effect size and reduces computational demands through effective dimensionality reduction. The substantial effect sizes observed in most experiments underscore the algorithm’s significant improvements in model performance.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"122 ","pages":"Article 103152"},"PeriodicalIF":14.7,"publicationDate":"2025-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143820445","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
End-to-end privacy-preserving image retrieval in cloud computing via anti-perturbation attentive token-aware vision transformer
IF 14.7 · CAS Q1 · Computer Science
Information Fusion Pub Date : 2025-04-03 DOI: 10.1016/j.inffus.2025.103153
Qihua Feng , Zhixun Lu , Chaozhuo Li , Feiran Huang , Jian Weng , Philip S. Yu
{"title":"End-to-end privacy-preserving image retrieval in cloud computing via anti-perturbation attentive token-aware vision transformer","authors":"Qihua Feng ,&nbsp;Zhixun Lu ,&nbsp;Chaozhuo Li ,&nbsp;Feiran Huang ,&nbsp;Jian Weng ,&nbsp;Philip S. Yu","doi":"10.1016/j.inffus.2025.103153","DOIUrl":"10.1016/j.inffus.2025.103153","url":null,"abstract":"<div><div>Privacy-Preserving Image Retrieval (PPIR) has gained popularity among users who upload encrypted personal images to remote servers, enabling image retrieval anytime and anywhere with privacy protection. Existing PPIR suggests extracting features from cipher-images through artificially-designed methods or Convolutional Neural Networks (CNNs). Nonetheless, manual feature engineering entails additional human effort, while CNNs are sensitive to spatial permutations as they primarily manipulate local texture features. To this end, we propose an innovative end-to-end PPIR, which not only eliminates the hassle of manual features but also enables learning expressive cipher-image representations. Specifically, since Vision Transformer (ViT) exhibits excellent robustness against permutation and occlusion in images, we elaborately design an Attentive Token-Aware (ATA) ViT model and hierarchical image block encryptions, which organically complement each other in an end-to-end system. The ATA module effectively learns informative block tokens and pays less attention to trivial and noisy encrypted blocks. Besides, to deal with the problem that the generalization of the model could be hindered by data desert, we adaptively construct the cipher-image augmentations by random block swapping and block erasing, aligning with our encryption operation. Extensive experiments on two datasets validate the superior retrieval accuracy and competitive image privacy protection performance of our proposed scheme.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"121 ","pages":"Article 103153"},"PeriodicalIF":14.7,"publicationDate":"2025-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143768934","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Enhancing few-sample spatio-temporal prediction via relational fusion-based hypergraph neural network
IF 14.7 · CAS Q1 · Computer Science
Information Fusion Pub Date : 2025-04-03 DOI: 10.1016/j.inffus.2025.103149
Xiaocao Ouyang , Yanhua Li , Dongyu Guo , Wei Huang , Xin Yang , Yan Yang , Junbo Zhang , Tianrui Li
{"title":"Enhancing few-sample spatio-temporal prediction via relational fusion-based hypergraph neural network","authors":"Xiaocao Ouyang ,&nbsp;Yanhua Li ,&nbsp;Dongyu Guo ,&nbsp;Wei Huang ,&nbsp;Xin Yang ,&nbsp;Yan Yang ,&nbsp;Junbo Zhang ,&nbsp;Tianrui Li","doi":"10.1016/j.inffus.2025.103149","DOIUrl":"10.1016/j.inffus.2025.103149","url":null,"abstract":"<div><div>Spatio-temporal prediction is a pivotal service for smart city applications, such as traffic and air quality prediction. Deep learning models are widely employed for this task, but the effectiveness of existing methods heavily depends on large amounts of data from urban sensors. However, in the early stages of smart city development, data scarcity poses a significant challenge due to the limited data collected from newly deployed sensors. Moreover, transferring data from other resource-rich cities is typically infeasible because of strict privacy policies. To address these challenges, we propose a relational fusion-based hypergraph neural network (RFHGN) for few-sample spatio-temporal prediction. RFHGN is trained directly on limited data within a city, exploiting multiple spatial correlations and hierarchical temporal dependencies to enrich spatio-temporal representations. Specifically, to enhance spatial expressiveness, we design a high-order spatial relation-aware learning module with an adaptive time-varying hypergraph structure. This structure is learned by integrating observational data and is iteratively updated during training, enabling the capture of dynamic high-order interactions. By combining these interactions with pairwise spatial representations, we derive mixed-order spatial representations. To reduce potential redundancy, we introduce a regularized independence loss to ensure the independence of pairwise and high-order spatial representations. Additionally, to effectively capture temporal dependencies at micro and macro levels, we develop a hierarchical temporal relation-aware learning module. Extensive experiments on three spatio-temporal prediction tasks: traffic flow, traffic speed, and air quality prediction demonstrate that RFHGN outperforms state-of-the-art baselines.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"121 ","pages":"Article 103149"},"PeriodicalIF":14.7,"publicationDate":"2025-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143776756","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
RSEA-MVGNN: Multi-view graph neural network with reliable structural enhancement and aggregation
IF 14.7 · CAS Q1 · Computer Science
Information Fusion Pub Date : 2025-04-02 DOI: 10.1016/j.inffus.2025.103143
Junyu Chen , Long Shi , Badong Chen
{"title":"RSEA-MVGNN: Multi-view graph neural network with reliable structural enhancement and aggregation","authors":"Junyu Chen ,&nbsp;Long Shi ,&nbsp;Badong Chen","doi":"10.1016/j.inffus.2025.103143","DOIUrl":"10.1016/j.inffus.2025.103143","url":null,"abstract":"<div><div>Graph Neural Networks (GNNs) have exhibited remarkable efficacy in learning from multi-view graph data. In the framework of multi-view graph neural networks, a critical challenge lies in effectively combining diverse views, where each view has distinct graph structure features (GSFs). Existing approaches to this challenge primarily focus on two aspects: (1) prioritizing the most important GSFs, (2) utilizing GNNs for feature aggregation. However, prioritizing the most important GSFs can lead to limited feature diversity, and existing GNN-based aggregation strategies process each view without considering view reliability. To address these issues, we propose a novel Multi-View Graph Neural Network with Reliable Structural Enhancement and Aggregation (RSEA-MVGNN). Firstly, we estimate view-specific uncertainty employing subjective logic. Based on this uncertainty, we design a reliable structural enhancement scheme by feature de-correlation algorithm. This approach enables each enhancement to focus on different GSFs, thereby achieving diverse feature representation in the enhanced structure. Secondly, the model learns view-specific beliefs and uncertainty as opinions, which are utilized to evaluate view reliability. Based on these opinions, the model enables high-reliability views to dominate GNN aggregation, thereby facilitating representation learning. Experimental results conducted on five real-world datasets demonstrate that RSEA-MVGNN outperforms several state-of-the-art GNN-based methods. Code is available at <span><span>http://github.com/junyu000/RSEA-MVGNN</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"121 ","pages":"Article 103143"},"PeriodicalIF":14.7,"publicationDate":"2025-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143768931","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
ADFusion: Multi-modal adaptive deep fusion for cancer subtype prediction
IF 14.7 · CAS Q1 · Computer Science
Information Fusion Pub Date : 2025-04-02 DOI: 10.1016/j.inffus.2025.103138
Ziye Zhang, Weixian Huang, Shijin Wang, Kaiwen Tan, Xiaorou Zheng, Shoubin Dong
{"title":"ADFusion: Multi-modal adaptive deep fusion for cancer subtype prediction","authors":"Ziye Zhang,&nbsp;Weixian Huang,&nbsp;Shijin Wang,&nbsp;Kaiwen Tan,&nbsp;Xiaorou Zheng,&nbsp;Shoubin Dong","doi":"10.1016/j.inffus.2025.103138","DOIUrl":"10.1016/j.inffus.2025.103138","url":null,"abstract":"<div><div>The identification of cancer subtypes is crucial for personalized treatment. Subtype prediction can be achieved by using multi-modal data collected from patients. Multi-modal cancer data contains hidden joint information that cannot be adequately tapped by current vector-based fusion methods. To address this, we propose a multi-modal adaptive deep fusion network ADFusion, which utilizes a hierarchical graph convolutional network HiGCN for high-quality representation of multi-modal cancer data. Subsequently, an adaptive deep fusion network based on deep equilibrium theory is designed to capture effectively multi-modal joint information, which is then fused with multi-modal feature vectors to produce the fused features. HiGCN includes co-expressed genes and sample similarity networks, which provide a more nuanced consideration of the relationships between genes, and also between samples, achieving superior representation of multi-modal genes data. Adaptive deep fusion network, with flexible non-fixed layer structure, is designed for mining multi-modal joint information, automatically adjusting its layers according to real-time training conditions, ensuring flexibility and broad applicability. ADFusion was evaluated across 5 public cancer datasets using 3 evaluation metrics, outperforming state-of-arts methods in all results. Additionally, ablation experiments, convergence analysis, and interpretability analysis also demonstrate the performance of ADFusion.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"121 ","pages":"Article 103138"},"PeriodicalIF":14.7,"publicationDate":"2025-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143768932","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Vehicle localization in an explainable dynamic Bayesian network framework for self-aware agents
IF 14.7 · CAS Q1 · Computer Science
Information Fusion Pub Date : 2025-04-02 DOI: 10.1016/j.inffus.2025.103136
Giulia Slavic , Pamela Zontone , Lucio Marcenaro , David Martín Gómez , Carlo Regazzoni
{"title":"Vehicle localization in an explainable dynamic Bayesian network framework for self-aware agents","authors":"Giulia Slavic ,&nbsp;Pamela Zontone ,&nbsp;Lucio Marcenaro ,&nbsp;David Martín Gómez ,&nbsp;Carlo Regazzoni","doi":"10.1016/j.inffus.2025.103136","DOIUrl":"10.1016/j.inffus.2025.103136","url":null,"abstract":"<div><div>This paper proposes a method to perform Visual-Based Localization within an explainable self-awareness framework, by combining deep learning with traditional signal processing methods. Localization, along with anomaly detection, is an important challenge in video surveillance and fault detection. Let us consider, for example, the case of a vehicle patrolling a train station: it must continuously know its location to effectively monitor the surroundings and respond to potential threats. In the proposed method, a Dynamic Bayesian Network model is learned. A vocabulary of clusters is obtained using the odometry and video data, and is employed to guide the training of the video model. As the video model, a combination of a Variational Autoencoder and a Kalman Filter is adopted. In the online phase, a Coupled Markov Jump Particle Filter is proposed for Visual-Based Localization. This filter combines a set of Kalman Filters with a Particle Filter, allowing us to extract possible anomalies in the test scenario as well. The proposed method is integrated into a framework based on awareness theories, and is data-driven, hierarchical, probabilistic, and explainable. The method is evaluated on trajectories from four real-world datasets, i.e., two terrestrial and two aerial. The localization accuracy and explainability of the method are analyzed in detail. We achieve a mean localization accuracy in meters of 1.65, 0.98, 0.23, and 0.87, on the four datasets.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"122 ","pages":"Article 103136"},"PeriodicalIF":14.7,"publicationDate":"2025-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143816414","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
FS-Diff: Semantic guidance and clarity-aware simultaneous multimodal image fusion and super-resolution
IF 14.7 · CAS Q1 · Computer Science
Information Fusion Pub Date : 2025-04-02 DOI: 10.1016/j.inffus.2025.103146
Yuchan Jie , Yushen Xu , Xiaosong Li , Fuqiang Zhou , Jianming Lv , Huafeng Li
{"title":"FS-Diff: Semantic guidance and clarity-aware simultaneous multimodal image fusion and super-resolution","authors":"Yuchan Jie ,&nbsp;Yushen Xu ,&nbsp;Xiaosong Li ,&nbsp;Fuqiang Zhou ,&nbsp;Jianming Lv ,&nbsp;Huafeng Li","doi":"10.1016/j.inffus.2025.103146","DOIUrl":"10.1016/j.inffus.2025.103146","url":null,"abstract":"<div><div>As an influential information fusion and low-level vision technique, image fusion integrates complementary information from source images to yield an informative fused image. A few attempts have been made in recent years to jointly realize image fusion and super-resolution. However, in real-world applications such as military reconnaissance and long-range detection missions, the target and background structures in multimodal images are easily corrupted, with low resolution and weak semantic information, which leads to suboptimal results in current fusion techniques. In response, we propose FS-Diff, a semantic guidance and clarity-aware joint image fusion and super-resolution method. FS-Diff unifies image fusion and super-resolution as a conditional generation problem. It leverages semantic guidance from the proposed clarity sensing mechanism for adaptive low-resolution perception and cross-modal feature extraction. Specifically, we initialize the desired fused result as pure Gaussian noise and introduce the bidirectional feature Mamba to extract the global features of the multimodal images. Moreover, utilizing the source images and semantics as conditions, we implement a random iterative denoising process via a modified U-Net network. This network istrained for denoising at multiple noise levels to produce high-resolution fusion results with cross-modal features and abundant semantic information. We also construct a powerful aerial view multiscene (AVMS) benchmark covering 600 pairs of images. Extensive joint image fusion and super-resolution experiments on six public and our AVMS datasets demonstrated that FS-Diff outperforms the state-of-the-art methods at multiple magnifications and can recover richer details and semantics in the fused images. The code is available at <span><span>https://github.com/XylonXu01/FS-Diff</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"121 ","pages":"Article 103146"},"PeriodicalIF":14.7,"publicationDate":"2025-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143776755","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Instruction-driven fusion of Infrared-visible images: Tailoring for diverse downstream tasks
IF 14.7 · CAS Q1 · Computer Science
Information Fusion Pub Date : 2025-04-02 DOI: 10.1016/j.inffus.2025.103148
Zengyi Yang , Yafei Zhang , Huafeng Li , Yu Liu
{"title":"Instruction-driven fusion of Infrared-visible images: Tailoring for diverse downstream tasks","authors":"Zengyi Yang ,&nbsp;Yafei Zhang ,&nbsp;Huafeng Li ,&nbsp;Yu Liu","doi":"10.1016/j.inffus.2025.103148","DOIUrl":"10.1016/j.inffus.2025.103148","url":null,"abstract":"<div><div>The primary value of infrared and visible image fusion technology lies in applying the fusion results to downstream tasks. However, existing methods face challenges such as increased training complexity and significantly compromised performance of individual tasks when addressing multiple downstream tasks simultaneously. To tackle this, we propose Task-Oriented Adaptive Regulation (T-OAR), an adaptive mechanism specifically designed for multi-task environments. Additionally, we introduce the Task-related Dynamic Prompt Injection (T-DPI) module, which generates task-specific dynamic prompts from user-input text instructions and integrates them into target representations. This guides the feature extraction module to produce representations that are more closely aligned with the specific requirements of downstream tasks. By incorporating the T-DPI module into the T-OAR framework, our approach generates fusion images tailored to task-specific requirements without the need for separate training or task-specific weights. This not only reduces computational costs but also enhances adaptability and performance across multiple tasks. Experimental results show that our method excels in object detection, semantic segmentation, and salient object detection, demonstrating its strong adaptability, flexibility, and task specificity. This provides an efficient solution for image fusion in multi-task environments, highlighting the technology’s potential across diverse applications. The source code is available at <span><span>https://github.com/YR0211/IDF-TDDT</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"121 ","pages":"Article 103148"},"PeriodicalIF":14.7,"publicationDate":"2025-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143768937","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Parameter-oriented contrastive schema and multi-level knowledge distillation for heterogeneous federated learning
IF 14.7 · CAS Q1 · Computer Science
Information Fusion Pub Date : 2025-04-01 DOI: 10.1016/j.inffus.2025.103123
Lele Fu , Yuecheng Li , Sheng Huang , Chuan Chen , Chuanfu Zhang , Zibin Zheng
{"title":"Parameter-oriented contrastive schema and multi-level knowledge distillation for heterogeneous federated learning","authors":"Lele Fu ,&nbsp;Yuecheng Li ,&nbsp;Sheng Huang ,&nbsp;Chuan Chen ,&nbsp;Chuanfu Zhang ,&nbsp;Zibin Zheng","doi":"10.1016/j.inffus.2025.103123","DOIUrl":"10.1016/j.inffus.2025.103123","url":null,"abstract":"<div><div>Federated learning aims to unite multiple data owners to collaboratively train a machine learning model without leaking the private data. However, the non-independent identically distributed (Non-IID) data differentiates the optimization directions of different clients, thus seriously impairing the performance of global model. Most efforts handling the data heterogeneity focus on the server or client side, adopting certain strategies to mitigate the differences of local models. These single-side solutions are limited in addressing the negative impact of heterogeneous data. In this paper, we attempt to overcome the problem of heterogenous federated learning simultaneously from dual sides. Specifically, to prevent the catastrophical forgetting of global information, we devise a parameter-oriented contrastive schema for correcting the optimization directions of local models on the client-side. Furthermore, considering that the only average of very diverse network parameters might damage the structural information, a multi-level knowledge distillation manner to repair the corrupt information of the global model is performed on the server-side. A multitude of experiments on four benchmark datasets demonstrate that the proposed method outperforms the state-of-the-art federated learning approaches on the Non-IID data.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"121 ","pages":"Article 103123"},"PeriodicalIF":14.7,"publicationDate":"2025-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143759324","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0