Information Fusion: Latest Articles

Dual-Driven Cross-Modal Contrastive Hashing Retrieval Network Via Structural Feature and Semantic Information
IF 14.7, Zone 1, Computer Science
Information Fusion Pub Date : 2025-04-30 DOI: 10.1016/j.inffus.2025.103252
Cheng Huang , Wenzhe Liu , Jinghua Wang , Jinrong Cui , Jie Wen
Volume 123, Article 103252
Abstract: Contrastive cross-modal hashing retrieval networks are widely recognized for their strong performance in binary hash code learning. However, three issues remain worth further investigation: (1) how to capture the structural features among intra-modal data and use them efficiently for subsequent hash code representation learning; (2) how to promote intra-modal learning and enhance the robustness of the resulting intra-modal features, which are as important as the inter-modal features; and (3) how to effectively harness semantic information to guide the hash code learning process. In response, this paper proposes the Dual-Driven Cross-Modal Contrastive Hashing Retrieval Network via Structural Feature and Semantic Information (DDSS), which consists of three components. First, DDSS extracts visual-modal and textual-modal features via Contrastive Language-Image Pre-training (CLIP) and takes them as the input for cross-modal hashing retrieval. Second, DDSS uses a Dual Branch Feature Learning Module to learn both structural features and self-attention features. Through intra-modal and inter-modal feature contrastive learning, DDSS promotes information consistency across modalities and eliminates low-quality private features within a single modality. Third, DDSS has a Dual Path Instance Hashing Module that guides hash code representation learning through instance-level and semantic-level contrastive learning. The experimental results demonstrate that DDSS outperforms the benchmark methods in the cross-modal hashing retrieval field. The experimental source code is available at https://github.com/hcpaper/DDSS.
Citations: 0
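The DDSS abstract above combines contrastive learning with binary hash codes. The following is a minimal sketch of those two generic ingredients, not the authors' implementation: sign binarization with Hamming-distance comparison, and an InfoNCE-style contrastive loss. The function names, the temperature value, and the toy vectors are all assumptions made for illustration.

```python
import math


def binarize(features):
    """Turn a real-valued feature vector into a {-1, +1} hash code via the sign function."""
    return [1 if x >= 0 else -1 for x in features]


def hamming_distance(code_a, code_b):
    """Count the positions where two equal-length hash codes disagree."""
    return sum(1 for a, b in zip(code_a, code_b) if a != b)


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(query, candidates, positive_index, temperature=0.5):
    """InfoNCE-style loss: low when the query is closest to its positive candidate.

    The temperature default is an illustrative assumption, not a value from the paper.
    """
    logits = [cosine(query, c) / temperature for c in candidates]
    peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
    log_partition = peak + math.log(sum(math.exp(l - peak) for l in logits))
    return log_partition - logits[positive_index]


# Hypothetical image feature and two text features (the first is the matching pair).
image_feature = [1.0, 0.0]
text_features = [[1.0, 0.0], [0.0, 1.0]]
loss_matched = info_nce(image_feature, text_features, positive_index=0)
loss_mismatched = info_nce(image_feature, text_features, positive_index=1)
```

In a retrieval setting, the binarized codes would be compared with `hamming_distance` at query time, while a loss such as `info_nce` would only be used during training.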
No escape: Towards suggestive clues guidance for cross-modality person re-identification
IF 14.7, Zone 1, Computer Science
Information Fusion Pub Date : 2025-04-29 DOI: 10.1016/j.inffus.2025.103185
Mingxin Yu , Yiyuan Ge , Zhihao Chen , Rui You , Lianqing Zhu , Mingwei Lin , Zeshui Xu
Volume 122, Article 103185
Abstract: Criminal activities are frequently committed at night to avoid attention, which seriously challenges traditional re-identification (ReID) systems. Recently, visible–infrared person re-identification (VI-ReID) has been in the spotlight for its wide applications in low-light scenes; it aims to match pedestrians across the inherent modality gap between infrared images (for night) and visible images (for daytime). Previous deep learning-based methods mainly bridge the modality gap either by cross-modality translation or by learning modality-shared representations. However, the former inevitably damages the original modality information, while the latter ignores fine-grained intrinsic metric relationships between cross-spectral features. In this paper, we propose a suggestive-clues reconfiguration (SCR) framework, which includes representation learning and feature reconfiguration sub-networks. Representation learning is pursued in the modality-shared domain, in which we propose a local cross-alignment (LCA) loss to further optimize the metric between cross-modality clustering components and centers, exploring fine-grained modality-consistent representations. In the feature reconfiguration network, we decouple infrared and visible modality features and introduce a reconfiguration encoder to learn identity-related suggestive clues, enhancing the controllability of cross-modality learning. Extensive experiments on the SYSU-MM01 and RegDB datasets demonstrate that SCR is a new state-of-the-art method. Specifically, the Rank-1 and Rank-10 accuracy of SCR are 97.9% and about 100% on the RegDB dataset. This research highlights the role of suggestive clues in VI-ReID, and the code is available at https://github.com/ISCLab-Bistu/VI-ReID.
Citations: 0
Improving protein–protein interaction modulator predictions via knowledge-fused language models
IF 14.7, Zone 1, Computer Science
Information Fusion Pub Date : 2025-04-26 DOI: 10.1016/j.inffus.2025.103227
Zitong Zhang , Quan Zou , Chunyu Wang , Junjie Wang , Lingling Zhao
Volume 123, Article 103227
Abstract: Protein–protein interactions (PPIs) play key roles in numerous biological processes, and their dysregulation can lead to various human diseases. Modulating these interactions with small-molecule PPI modulators has emerged as a promising strategy for treating such diseases. However, current computational approaches for screening PPI modulators often fail to integrate biomolecular expertise and do not elucidate interaction mechanisms. Here, we propose a knowledge-fused modulator-PPI interaction prediction method (KFPPIMI) to alleviate these issues. KFPPIMI constructs separate representation models for modulators and proteins, each of which integrates external knowledge from textual and graph-based data sources via a language modeling framework. Fusing the nuanced expression of natural language with the structural attributes of biomolecules gives KFPPIMI a holistic view of modulator-PPI interactions. Extensive experiments are conducted to evaluate the effectiveness of KFPPIMI and its individual components. The results show that KFPPIMI outperforms existing methods in different scenarios. Moreover, the modulator and protein representation models can be successfully applied to their respective downstream tasks with comparable performance.
Citations: 0
Social network group decision making with minimum cost and maximum satisfaction consensus based on bargaining game
IF 14.7, Zone 1, Computer Science
Information Fusion Pub Date : 2025-04-25 DOI: 10.1016/j.inffus.2025.103270
Feng Wang , Xiaobing Yu , Yaqi Mao , Witold Pedrycz
Volume 123, Article 103270
Abstract: In group decision making (GDM), different decision makers (DMs) provide different evaluation opinions on the alternatives, and reaching consensus on these opinions is a critical issue. To improve consensus efficiency, a dynamic social network GDM method based on a bargaining game is developed. First, we build a minimum total cost consensus model for the moderator and then a maximum individual satisfaction consensus model for inconsistent DMs. To reconcile the differences in the modified opinions and unit compensation derived from these two types of models, we devise offer-counteroffer strategies for the moderator and DMs under various cases. At the same time, we establish a complete management system for the DM weights based on different behaviors in consensus promotion. In addition, we formulate the trust evolution process of all types of DMs to further update the weights of DMs. Based on this, a consensus feedback iterative mechanism driven by the trust network is constructed. Finally, we use the example of locating an international R&D center to illustrate the entire GDM process. The comparative analysis demonstrates the effectiveness of the proposed method.
Citations: 0
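The abstract above mentions a minimum total cost consensus model for the moderator. As a rough illustration only, the sketch below solves the simplest textbook form of that idea: pick one consensus opinion that minimizes the sum of unit cost times opinion change, which a weighted median achieves. The paper's actual model (with satisfaction, bargaining, and trust updates) is richer; the function name and the numbers are assumptions, and the input is assumed to be non-empty with positive costs.

```python
def min_cost_consensus(opinions, unit_costs):
    """Return (consensus_opinion, total_cost) minimising sum(c_i * |o_i - o|).

    A weighted median of the opinions, weighted by unit cost, minimises this
    weighted absolute deviation. Assumes non-empty input and positive costs.
    """
    ranked = sorted(zip(opinions, unit_costs))
    half = sum(unit_costs) / 2
    running = 0.0
    consensus = ranked[-1][0]
    for opinion, cost in ranked:
        running += cost
        if running >= half:  # first opinion whose cumulative cost reaches half the total
            consensus = opinion
            break
    total = sum(c * abs(o - consensus) for o, c in zip(opinions, unit_costs))
    return consensus, total


# Three hypothetical decision makers with equal unit compensation costs.
equal_consensus, equal_total = min_cost_consensus([0.2, 0.5, 0.9], [1, 1, 1])

# Moving the first decision maker is expensive, so the consensus shifts to 0.2.
costly_consensus, costly_total = min_cost_consensus([0.2, 0.5, 0.9], [5, 1, 1])
```

The second call shows the design intuition: the moderator pays least by leaving the most expensive decision maker's opinion unchanged.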
Robust privacy-preserving aggregation against poisoning attacks for secure distributed data fusion
IF 14.7, Zone 1, Computer Science
Information Fusion Pub Date : 2025-04-24 DOI: 10.1016/j.inffus.2025.103223
Chao Huang , Yanqing Yao , Xiaojun Zhang
Volume 122, Article 103223
Abstract: Privacy-preserving data aggregation is well suited to federated learning, enabling an aggregator to learn specified fusion statistics over private data held by clients. Robustness is also a critical requirement in federated learning, since a malicious client can readily launch poisoning attacks by submitting artificial and malformed model updates to the central server. To this end, we present a robust privacy-preserving data aggregation protocol based on a distributed trust model, which achieves privacy protection through three-party computation based on replicated secret sharing with an honest majority. The protocol also achieves robustness by securely computing an input validation strategy called norm bounding, including ℓ∞-norm and ℓ2-norm bounding, which has been proven effective in defending against poisoning attacks. Following the best practice of hybrid protocol design, we exploit both Boolean sharing and arithmetic sharing to efficiently enforce ℓ∞-norm and ℓ2-norm bounding, respectively. Additionally, we propose a novel share conversion protocol that converts Boolean shares into arithmetic ones, which is of independent interest and could be used in other protocols. We provide a security analysis of the protocol based on the standard simulation paradigm and the modular composition theorem, concluding that the presented protocol achieves secure aggregation functionality with norm bounding, with computational security in the presence of one static semi-honest server. Comprehensive efficiency analysis and empirical experiments demonstrate its superiority over related protocols.
Citations: 0
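To make the two building blocks named in the abstract above more concrete, here is a toy sketch of three-party replicated secret sharing over a ring, together with ℓ∞ and ℓ2 norm-bound checks. It is a deliberate simplification: the norm checks run in plaintext before sharing, whereas the paper evaluates them securely on shares. The ring size, function names, and sample updates are assumptions, and this code is not secure for real use.

```python
import secrets

MODULUS = 2 ** 32  # assumed ring size for the arithmetic shares


def share(value):
    """Split an integer into three additive parts; server i holds parts (i, i+1 mod 3)."""
    x0 = secrets.randbelow(MODULUS)
    x1 = secrets.randbelow(MODULUS)
    x2 = (value - x0 - x1) % MODULUS
    parts = [x0, x1, x2]
    return [(parts[i], parts[(i + 1) % 3]) for i in range(3)]


def add_shares(a, b):
    """Add two shared values locally, server by server, without any communication."""
    return [((a[i][0] + b[i][0]) % MODULUS, (a[i][1] + b[i][1]) % MODULUS)
            for i in range(3)]


def reconstruct(shares):
    """Recover the value from servers 0 and 1, which together hold all three parts."""
    total = (shares[0][0] + shares[0][1] + shares[1][1]) % MODULUS
    return total - MODULUS if total >= MODULUS // 2 else total  # decode as signed


def within_linf(update, bound):
    return all(abs(v) <= bound for v in update)


def within_l2(update, bound):
    return sum(v * v for v in update) <= bound * bound


def aggregate(updates, linf_bound, l2_bound):
    """Sum only the updates that pass both norm bounds (checked in plaintext here)."""
    accepted = [u for u in updates
                if within_linf(u, linf_bound) and within_l2(u, l2_bound)]
    totals = [share(0) for _ in range(len(updates[0]))]
    for update in accepted:
        for k, v in enumerate(update):
            totals[k] = add_shares(totals[k], share(v))
    return [reconstruct(t) for t in totals]


# The third hypothetical update exceeds the l-infinity bound and is dropped.
result = aggregate([[1, -2], [3, 4], [100, 0]], linf_bound=10, l2_bound=10)
```

Any two servers can reconstruct a value because each part is held by two of the three servers, which is what lets the scheme tolerate one semi-honest server.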
CLDM-MMNNs: Cross-layer defense mechanisms through multi-modal neural networks fusion for end-to-end cybersecurity—Issues, challenges, and future directions
IF 14.7, Zone 1, Computer Science
Information Fusion Pub Date : 2025-04-24 DOI: 10.1016/j.inffus.2025.103222
Sijjad Ali , Jia Wang , Victor C.M. Leung , Farhan Bashir , Uzair Aslam Bhatti , Shuaib Ahmed Wadho , Mamoona Humayun
Volume 122, Article 103222
Abstract: Cybersecurity threats have grown in complexity and scale, necessitating robust defense mechanisms that integrate multiple layers of network security. Multi-modal neural networks (MMNNs) have emerged as a powerful tool for addressing such challenges because they can process and integrate heterogeneous data sources. This review provides an in-depth analysis of cross-layer defense mechanisms that leverage MMNNs for end-to-end cybersecurity. The study explores the foundational principles of MMNNs and their applications in intrusion detection, malware analysis, anomaly detection, and advanced persistent threat (APT) mitigation. The paper emphasizes the synergy between multi-modal data integration and neural network architectures, which enables real-time threat detection and adaptive response. By categorizing existing approaches and highlighting key advancements, this review outlines current limitations, including computational overhead and model interpretability, and suggests future research directions for developing efficient, scalable, and explainable MMNN-based defense systems.
Citations: 0
PJPFL: Personalized federated learning with privacy preservation based on sample similarity
IF 14.7, Zone 1, Computer Science
Information Fusion Pub Date : 2025-04-24 DOI: 10.1016/j.inffus.2025.103221
Hongming Zhang, Qianqian Su
Volume 122, Article 103221
Abstract: Federated learning (FL) is a distributed machine learning paradigm. However, existing approaches struggle to achieve both privacy protection and effective personalization. Moreover, existing methods assume users will always adopt personalized updates, overlooking the need for flexible control that lets users decide whether to personalize based on their specific requirements. In this paper, we propose PJPFL, a novel personalized federated learning method that enables a flexible trade-off between the global model's generalization ability and personalized updates derived from local data. By integrating private set intersection (PSI) and Jaccard similarity, PJPFL allows users to customize model updates based on their individual needs while preserving privacy. To further enhance security, we employ homomorphic encryption (HE) to protect model gradients and parameters from inference attacks, a known vulnerability in FL. Experimental results demonstrate that PJPFL significantly improves model adaptability to local data environments, outperforming both FedAvg and FedProx in personalized update scenarios without incurring additional computational or communication overhead.
Citations: 0
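The PJPFL abstract above relies on Jaccard similarity over sample sets and on an optional personalized update. A minimal sketch of both ideas follows. It computes the set intersection in plaintext, whereas the paper obtains it through private set intersection, and the blending rule in `personalize` is a hypothetical choice made for illustration, not the method from the paper.

```python
def jaccard(set_a, set_b):
    """Jaccard similarity |A ∩ B| / |A ∪ B|; defined as 0.0 when both sets are empty."""
    union = set_a | set_b
    if not union:
        return 0.0
    return len(set_a & set_b) / len(union)


def personalize(global_weights, local_weights, similarity, enabled=True):
    """Blend global and local weights; a user may opt out via enabled=False.

    Hypothetical rule: the more a client's samples resemble the reference set,
    the more it leans on the global model.
    """
    if not enabled:
        return list(global_weights)
    return [similarity * g + (1 - similarity) * l
            for g, l in zip(global_weights, local_weights)]


# Hypothetical sample identifiers held by one client and by a reference set.
client_samples = {1, 2, 3}
reference_samples = {2, 3, 4}
similarity = jaccard(client_samples, reference_samples)

blended = personalize([1.0, 3.0], [3.0, 1.0], similarity)
opted_out = personalize([1.0, 3.0], [3.0, 1.0], similarity, enabled=False)
```

The `enabled` flag mirrors the flexible control described in the abstract: a user who does not want personalization simply keeps the global weights.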
Ethically Responsible Decision Making for Anomaly Detection in Complex Driving Scenes
IF 14.7, Zone 1, Computer Science
Information Fusion Pub Date : 2025-04-24 DOI: 10.1016/j.inffus.2025.103226
Liming Huang, Yulei Wu
Volume 122, Article 103226
Abstract: The rise of machine and deep learning has revolutionized artificial intelligence (AI) across diverse domains. However, most AI research focuses on optimizing detection accuracy or decision-making precision for specific input data, often overlooking the integration of ethical considerations needed to address the complexities of real-world scenarios. Applications like autonomous driving require not only reliable data processing performance but also strict adherence to ethical principles that align with societal values. This paper introduces an Ethically Responsible Decision-Making (ER-DM) model, in which ethical principles are mathematically formulated and integrated into the reinforcement learning (RL) framework. To address the challenges of operationalizing abstract ethical principles, we introduce a dual ethical paradigm based on Deontology and Consequentialism, enabling regulatory constraints in state transitions and policy networks, and outcome evaluation in reward functions, respectively. Additionally, we propose a novel task, Ethically Responsible Anomaly Detection (ER-AD), which leverages enriched ethical scenario information to classify obstacles into four risk levels based on their ethical abnormality. The ER-DM model is systematically validated in complex driving scenarios through experiments, demonstrating at least a 6% improvement in decision-making accuracy compared to baseline models. Furthermore, by integrating the ER-DM model with deep learning segmentation models, we establish an end-to-end detection system, achieving significant enhancements in image-based anomaly detection tasks.
Citations: 0
Cross-attention among spectrum, waveform and SSL representations with bidirectional knowledge distillation for speech enhancement
IF 14.7, Zone 1, Computer Science
Information Fusion Pub Date : 2025-04-24 DOI: 10.1016/j.inffus.2025.103218
Hang Chen , Chenxi Wang , Qing Wang , Jun Du , Sabato Marco Siniscalchi , Genshun Wan , Jia Pan , Huijun Ding
Volume 122, Article 103218
Abstract: We have developed a speech enhancement (SE) model backbone that uses cross-attention among spectrum, waveform and self-supervised learned representations (CA-SW-SSL) to integrate knowledge from diverse feature domains. The CA-SW-SSL model integrates the cross spectrum and waveform attention (CSWA) model to connect the spectrum and waveform branches, along with a dual-path cross-attention module to select outputs from different layers of the self-supervised learning (SSL) model. To handle the increased complexity of SSL integration, we introduce a bidirectional knowledge distillation (BiKD) framework for model compression. The proposed adaptive layered distance measure (ALDM) maximizes the Gaussian likelihood between clean and enhanced multi-level SSL features during the backward knowledge distillation (BKD) process. Meanwhile, in the forward process, the CA-SW-SSL model acts as a teacher, using the novel teacher–student Barlow Twins (TSBT) loss to guide the training of the CSWA student models, including both lite and tiny versions. Experiments on the DNS-Challenge and Voicebank+Demand datasets demonstrate that the CSWA-Lite+BiKD model outperforms existing joint spectrum-waveform methods and surpasses the state of the art on the DNS-Challenge non-blind test set with half the computational load. Further, the CA-SW-SSL+BiKD model outperforms all CSWA models and current SSL-based methods.
Citations: 0
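The abstract above builds its teacher–student (TSBT) loss on Barlow Twins. The sketch below shows only the standard Barlow Twins objective, which pushes the cross-correlation matrix of two embedding batches toward the identity. How the paper adapts it to a teacher and a student is not stated in the abstract, so treating one batch as "teacher" and the other as "student" is an assumption here, as is the off-diagonal weight.

```python
import math


def standardize(batch):
    """Return the batch as feature columns, each with zero mean and unit variance."""
    count = len(batch)
    columns = []
    for column in zip(*batch):
        mean = sum(column) / count
        std = math.sqrt(sum((x - mean) ** 2 for x in column) / count) or 1.0
        columns.append([(x - mean) / std for x in column])
    return columns


def barlow_twins_loss(batch_a, batch_b, off_diag_weight=0.005):
    """Sum of (1 - C_ii)^2 plus weighted C_ij^2 for i != j, with C the cross-correlation."""
    cols_a, cols_b = standardize(batch_a), standardize(batch_b)
    count = len(batch_a)
    loss = 0.0
    for i, col_a in enumerate(cols_a):
        for j, col_b in enumerate(cols_b):
            corr = sum(x * y for x, y in zip(col_a, col_b)) / count
            if i == j:
                loss += (1 - corr) ** 2  # invariance term: matching features should agree
            else:
                loss += off_diag_weight * corr ** 2  # redundancy-reduction term
    return loss


# Hypothetical teacher and student embeddings (4 samples, 2 features each).
teacher = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
student_same = [row[:] for row in teacher]
student_flipped = [[-x for x in row] for row in teacher]

loss_same = barlow_twins_loss(teacher, student_same)
loss_flipped = barlow_twins_loss(teacher, student_flipped)
```

Identical, decorrelated embeddings give a loss of zero, while sign-flipped embeddings are penalized on every diagonal entry.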
Cross-modal prototype based multimodal federated learning under severely missing modality
IF 14.7, Zone 1, Computer Science
Information Fusion Pub Date : 2025-04-24 DOI: 10.1016/j.inffus.2025.103219
Huy Q. Le , Chu Myaet Thwal , Yu Qiao , Ye Lin Tun , Minh N.H. Nguyen , Eui-Nam Huh , Choong Seon Hong
Volume 122, Article 103219
Abstract: Multimodal federated learning (MFL) has emerged as a decentralized machine learning paradigm, allowing multiple clients with different modalities to collaborate on training a global model across diverse data sources without sharing their private data. However, challenges such as data heterogeneity and severely missing modalities hinder the robustness of MFL and significantly degrade the performance of the global model. Missing modalities in real-world applications, such as autonomous driving, often arise from factors like sensor failures, leading to knowledge gaps during the training process. Specifically, the absence of a modality introduces misalignment during the local training phase, stemming from zero-filling for clients with missing modalities. Consequently, achieving robust generalization in the global model becomes imperative, especially when dealing with clients that have incomplete data. In this paper, we propose Multimodal Federated Cross Prototype Learning (MFCPL), a novel approach for MFL under severely missing modalities. MFCPL leverages the complete prototypes to provide diverse modality knowledge at the modality-shared level through cross-modal regularization and at the modality-specific level through a cross-modal contrastive mechanism. Additionally, our approach introduces cross-modal alignment to regularize modality-specific features, thereby enhancing overall performance, particularly in scenarios involving severely missing modalities. Through extensive experiments on four multimodal datasets, we demonstrate the effectiveness of MFCPL in mitigating the challenges of data heterogeneity and severely missing modalities while improving the overall performance and robustness of MFL.
Citations: 0
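The MFCPL abstract above centers on prototypes. As a generic illustration of what a prototype usually means in this line of work, the sketch below computes class prototypes as per-class mean embeddings and assigns a sample to its nearest prototype. The cross-modal regularization and contrastive terms of the paper are not reproduced, and the function names and toy embeddings are assumptions.

```python
from collections import defaultdict


def class_prototypes(embeddings, labels):
    """Average the embeddings of each class to obtain one prototype per class."""
    groups = defaultdict(list)
    for vector, label in zip(embeddings, labels):
        groups[label].append(vector)
    return {label: [sum(column) / len(vectors) for column in zip(*vectors)]
            for label, vectors in groups.items()}


def nearest_prototype(vector, prototypes):
    """Return the label whose prototype has the smallest squared Euclidean distance."""
    def squared_distance(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    return min(prototypes, key=lambda label: squared_distance(vector, prototypes[label]))


# Hypothetical embeddings from clients that have the complete modality set.
prototypes = class_prototypes(
    [[0.0, 0.0], [2.0, 2.0], [10.0, 10.0]],
    [0, 0, 1],
)

# A hypothetical embedding from a client with incomplete data, matched to a class.
predicted = nearest_prototype([9.0, 8.0], prototypes)
```

Prototypes are compact, so sharing them instead of raw data fits the federated setting, where clients cannot exchange private samples.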