Information Fusion: Latest Articles

PNCD: Mitigating LLM hallucinations in noisy environments–A medical case study
IF 14.7 | Q1 | Computer Science
Information Fusion | Pub Date: 2025-05-16 | DOI: 10.1016/j.inffus.2025.103328
Jiayi Qu, Jun Liu, Xiangjun Liu, Meihui Chen, Jinchi Li, Jintao Wang
Abstract: Although large language models (LLMs) have demonstrated impressive reasoning capabilities, their responses may contain inaccurate or fictitious information because noise and redundancy in the data interfere with the model's reasoning. Noise is difficult to avoid in massive datasets, and manual denoising demands substantial time, manpower, and material resources. In the medical and legal domains in particular, specialized textual data requires a stronger ability to cope with LLM hallucinations: maintaining the accuracy of information in noisy environments, without distortion, modification, or the introduction of creative elements, is critical. In this paper, we propose Adaptive Positive and Negative weight Contrast Decoding (PNCD), a RAG-based method for mitigating LLM hallucinations in noisy-context environments. Specifically, we construct a set of expert and non-expert LLMs for the base LLM: expert LLMs extract information from a set of correct examples, while non-expert LLMs extract information from a set of negative examples that induce hallucinations in the base LLM. Their goal is to eliminate noisy information and identify redundant information in the output space of the base LLM, guiding it to generate more accurate factual content. We assign enhancement weights to the expert LLMs' parameter distributions and penalty weights to the non-expert LLMs' parameter distributions, amplifying the predictions of the expert LLMs, suppressing those of the non-expert LLMs, and determining the final correct prediction of the next token. In addition, a KV buffer is introduced to reduce resource consumption. Experimental results show that PNCD achieves state-of-the-art results on the medical dataset, with an inference speed of 32 tokens/sec on a single card (RTX 4070TiS) and a GPU memory footprint of 0.84 GB (down from an initial 4.2 GB). The method also shows some generalization to legal datasets and several public datasets.
(Information Fusion, Volume 123, Article 103328)
Citations: 0
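The weighted expert/non-expert decoding described above can be sketched as a generic contrastive-decoding step over next-token logits. The combination rule, the weight names `alpha` and `beta`, and the toy vocabulary below are illustrative assumptions, not the paper's exact PNCD formulation:

```python
import numpy as np

def contrastive_next_token(base_logits, expert_logits, nonexpert_logits,
                           alpha=1.0, beta=0.5):
    """Pick the next token by boosting expert predictions and penalizing
    non-expert (hallucination-prone) predictions.

    The additive logit combination and the alpha/beta weights are
    illustrative assumptions; PNCD's exact rule may differ.
    """
    combined = base_logits + alpha * expert_logits - beta * nonexpert_logits
    # Softmax over the combined logits (shift by max for stability).
    probs = np.exp(combined - combined.max())
    probs /= probs.sum()
    return int(np.argmax(probs)), probs

# Toy vocabulary of 4 tokens: the expert model favors token 2, while the
# non-expert (hallucinating) model favors token 0.
base = np.array([1.0, 0.2, 1.1, 0.3])
expert = np.array([0.1, 0.0, 2.0, 0.1])
nonexp = np.array([2.0, 0.1, 0.2, 0.1])
tok, p = contrastive_next_token(base, expert, nonexp)
```

With these toy logits the penalty on the non-expert's preferred token lets the expert-backed token win, which is the intended effect of the enhancement/penalty weighting.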
Multimodal graph representation learning for robust surgical workflow recognition with adversarial feature disentanglement
IF 14.7 | Q1 | Computer Science
Information Fusion | Pub Date: 2025-05-16 | DOI: 10.1016/j.inffus.2025.103290
Long Bai, Boyi Ma, Ruohan Wang, Guankun Wang, Beilei Cui, Zhongliang Jiang, Mobarakol Islam, Zhe Min, Jiewen Lai, Nassir Navab, Hongliang Ren
Abstract: Surgical workflow recognition is vital for automating tasks, supporting decision-making, and training novice surgeons, ultimately improving patient safety and standardizing procedures. However, data corruption can degrade performance, owing to issues such as occlusion from bleeding or smoke in surgical scenes and problems with data storage and transmission, so a robust workflow recognition model is urgently needed. We explore a robust graph-based multimodal approach that integrates vision and kinematic data to enhance accuracy and reliability: vision data captures dynamic surgical scenes, while kinematic data provides precise movement information, overcoming the limitations of visual recognition under adverse conditions. We propose a multimodal Graph Representation network with Adversarial feature Disentanglement (GRAD) for robust surgical workflow recognition in challenging scenarios with domain shifts or corrupted data. Specifically, we introduce a Multimodal Disentanglement Graph Network (MDGNet) that captures fine-grained visual information while explicitly modeling the complex relationships between vision and kinematic embeddings through graph-based message modeling. To align feature spaces across modalities, we propose a Vision-Kinematic Adversarial (VKA) framework that leverages adversarial training to reduce modality gaps and improve feature consistency. Furthermore, we design a Contextual Calibrated Decoder that incorporates temporal and contextual priors to enhance robustness against domain shifts and corrupted data. Extensive comparative and ablation experiments demonstrate the effectiveness of our model and the proposed modules: we achieve accuracies of 86.87% and 92.38% on two public datasets, respectively. Robustness experiments further show that our method effectively handles data corruption during storage and transmission, exhibiting excellent stability. Our approach aims to advance automated surgical workflow recognition, addressing the complexities and dynamism inherent in surgical procedures.
(Information Fusion, Volume 123, Article 103290)
Citations: 0
Machine learning for modelling unstructured grid data in computational physics: A review
IF 14.7 | Q1 | Computer Science
Information Fusion | Pub Date: 2025-05-15 | DOI: 10.1016/j.inffus.2025.103255
Sibo Cheng, Marc Bocquet, Weiping Ding, Tobias Sebastian Finn, Rui Fu, Jinlong Fu, Yike Guo, Eleda Johnson, Siyi Li, Che Liu, Eric Newton Moro, Jie Pan, Matthew Piggott, Cesar Quilodran, Prakhar Sharma, Kun Wang, Dunhui Xiao, Xiao Xue, Yong Zeng, Mingrui Zhang, Rossella Arcucci
Abstract: Unstructured grid data are essential for modelling complex geometries and dynamics in computational physics, yet their inherent irregularity presents significant challenges for conventional machine learning (ML) techniques. This paper provides a comprehensive review of advanced ML methodologies designed to handle unstructured grid data in high-dimensional dynamical systems. Key approaches discussed include graph neural networks, transformer models with spatial attention mechanisms, interpolation-integrated ML methods, and meshless techniques such as physics-informed neural networks. These methodologies have proven effective across diverse fields, including fluid dynamics and environmental simulations. The review is intended as a guidebook for computational scientists seeking to apply ML approaches to unstructured grid data in their domains, as well as for ML researchers looking to address challenges in computational physics. It places special focus on how ML methods can overcome the inherent limitations of traditional numerical techniques and, conversely, how insights from computational physics can inform ML development. To that end, the review focuses mainly on papers from the past decade that reflect strong interactions between computational physics and deep learning methods. To support benchmarking, it also provides a summary of open-access datasets of unstructured grid data in computational physics. Finally, emerging directions such as generative models with unstructured data, reinforcement learning for mesh generation, and hybrid physics-data-driven paradigms are discussed to inspire future advancements in this evolving field.
(Information Fusion, Volume 123, Article 103255)
Citations: 0
Semantic information guided multimodal skeleton-based action recognition
IF 14.7 | Q1 | Computer Science
Information Fusion | Pub Date: 2025-05-15 | DOI: 10.1016/j.inffus.2025.103289
Chenghao Li, Wenlong Liang, Fei Yin, Yahui Zhao, Zhenguo Zhang
Abstract: Human skeleton sequences are a crucial data modality for representing human motion. The primary challenge in skeleton-based action recognition lies in effectively capturing the spatio-temporal correlations among skeleton joints; when the human body interacts with other objects in the background, these correlations may become less apparent. To tackle this issue, we analyze the semantic information of human actions and propose a Semantic Information Guided human skeleton Action Recognition method (ActionGCL), which facilitates the differentiation of skeleton data from different action categories within a latent space. Concretely, we first construct a spatio-temporal action encoder based on graph convolutional networks to extract dependencies among human skeleton sequences. It comprises alternating stacks of temporal feature extraction and spatial graph convolution modules: the temporal feature extraction module integrates multiscale temporal convolutional networks to capture rich inter-frame correlations among nodes, while the spatial graph convolution module adaptively learns a sample-specific topology graph. Subsequently, to leverage the rich semantic information embedded in action labels, we design a multimodal contrastive learning module that uses both skeleton and textual data. This module optimizes skeleton representations in both the skeleton-to-text and text-to-skeleton directions, employing the semantic information in action labels to guide the training of the spatio-temporal action encoder, and facilitates accurate identification of ambiguous actions that are difficult to discern from spatio-temporal correlations alone. Experimental results on two prominent action recognition datasets, NTU RGB+D 60 and NTU RGB+D 120, demonstrate that ActionGCL is effective and significantly outperforms other models in recognition accuracy.
(Information Fusion, Volume 123, Article 103289)
Citations: 0
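The spatial graph convolution over skeleton joints described above builds on the standard GCN update H' = σ(ÂHW), where Â is the normalized adjacency with self-loops. A minimal sketch on a toy three-joint chain; this is the generic GCN step, not the paper's adaptive sample-specific topology module:

```python
import numpy as np

def gcn_layer(H, A, W):
    """One spatial graph-convolution step: add self-loops, symmetrically
    normalize the adjacency, propagate joint features, apply ReLU.
    Generic GCN update, not ActionGCL's exact module."""
    A_hat = A + np.eye(A.shape[0])            # self-loops keep each joint's own feature
    D_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt  # symmetric normalization
    return np.maximum(A_norm @ H @ W, 0.0)    # ReLU activation

# Toy 3-joint chain (e.g. shoulder-elbow-wrist) with 2-D joint features.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
H = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
W = np.eye(2)                                 # identity weights for illustration
out = gcn_layer(H, A, W)
```

Each output row mixes a joint's features with those of its skeletal neighbors, which is what lets the encoder model spatial correlations between joints.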
XSleepFusion: A dual-stage information bottleneck fusion framework for interpretable multimodal sleep analysis
IF 14.7 | Q1 | Computer Science
Information Fusion | Pub Date: 2025-05-14 | DOI: 10.1016/j.inffus.2025.103275
Shuaicong Hu, Yanan Wang, Jian Liu, Cuiwei Yang
Abstract: Sleep disorders affect hundreds of millions of people globally, and accurate assessment of sleep apnea (SA) and sleep staging (SS) is essential for clinical diagnosis and early intervention. Manual analysis by sleep experts is time-consuming and subject to inter-rater variability; deep learning (DL) approaches offer automation potential but face fundamental challenges in multimodal physiological signal integration and interpretability. This paper presents XSleepFusion, a cross-modal fusion framework based on information bottleneck (IB) theory for automated sleep analysis. The framework introduces a dual-stage IB mechanism that systematically processes physiological signals: first eliminating intra-modal redundancy, then optimizing cross-modal feature fusion. An evolutionary attention Transformer network (EAT-Net) backbone extracts temporal features at multiple scales, providing interpretable attention patterns. Experimental validation on eight clinical datasets comprising over 15,000 sleep recordings demonstrates the framework's effectiveness in polysomnogram (PSG)-based SA detection, electrocardiogram (ECG)-based SA detection, and SS. The architecture achieves superior generalization across varying signal qualities and modal combinations, while the dual-stage design enables flexible integration of diverse physiological signals. Through interpretable feature representations and robust cross-modal fusion, XSleepFusion establishes a reliable and adaptable foundation for clinical sleep monitoring.
(Information Fusion, Volume 123, Article 103275)
Citations: 0
GL-BKGNN: Graphlet-based Bi-Kernel Interpretable Graph Neural Networks
IF 14.7 | Q1 | Computer Science
Information Fusion | Pub Date: 2025-05-14 | DOI: 10.1016/j.inffus.2025.103284
Lixiang Xu, Kang Jiang, Xin Niu, Enhong Chen, Bin Luo, Philip S. Yu
Abstract: While graph neural networks (GNNs) have successfully applied generalized convolution operations to the graph domain, directly explaining the dependency between the output and the presence of particular features and structural patterns in the input graph remains challenging. Inspired by image filters in standard convolutional neural networks (CNNs), we propose a neural framework that connects bi-kernel design with GNNs, incorporating predefined rules and focusing on the interpretability of graph filters during training. We treat graph kernels according to their differentiability: differentiable graph kernels are trained end-to-end via backpropagation to generate gradient information for model optimization, while the stronger stability of non-differentiable graph kernels is leveraged to capture local critical subgraphs, achieving deep fusion of structural features for node updates. Extensive experiments demonstrate that our model achieves competitive performance on graph classification datasets while providing additional benefits in interpretability.
(Information Fusion, Volume 123, Article 103284)
Citations: 0
An information fusion model of mutual influence between focal elements: A perspective on interference effects in Dempster–Shafer evidence theory
IF 14.7 | Q1 | Computer Science
Information Fusion | Pub Date: 2025-05-14 | DOI: 10.1016/j.inffus.2025.103286
Xiaozhuan Gao, Lipeng Pan
Abstract: Dempster's rule of combination is a fundamental element of Dempster–Shafer evidence theory, designed to integrate uncertain information from multiple independent sources. Its primary goal is to reduce uncertainty and present better-quality information to a decision-making process. Dempster's rule addresses conflicts among the pieces of evidence provided by multiple sources, so the fusion process tends to favor one hypothesis over the others; however, it does not consider potential interactions between focal elements. This interaction phenomenon resembles the interference effects observed in quantum theory, where the superposition of different states leads to a redistribution of state probabilities. In recent years, interference effects have also been studied in various fields, including decision science, quantum machine learning, and autonomous driving. In this paper, we therefore present a novel interference-effects-based combination rule for Dempster–Shafer evidence theory that accounts for the impact of interference arising from potential interactions between focal elements. In the proposed method, interference effects in information processing are attributed to the uncertainty within the mass functions of multiple sources, so this uncertainty can be used to quantify the interference. The advantages of considering interference effects in the fusion process are detailed and validated through several numerical examples, with a similarity analysis and other relevant methods conducted to further substantiate them. Finally, we evaluate the new method on real-world classification datasets and compare it with other preprocessing methods in evidence theory. Experimental results on 32 datasets show the superiority of the new method in classification accuracy compared with other preprocessing methods.
(Information Fusion, Volume 124, Article 103286)
Citations: 0
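For context, Dempster's rule referenced above combines two mass functions by multiplying the masses of intersecting focal elements and renormalizing by the total conflicting mass K. A minimal sketch over a two-element frame (the hypothetical sensor masses are made up for illustration):

```python
from itertools import product

def dempster_combine(m1, m2):
    """Dempster's rule of combination for two mass functions.
    Focal elements are frozensets; K accumulates the conflicting mass
    assigned to empty intersections, which is then renormalized away."""
    combined = {}
    K = 0.0
    for (B, b_mass), (C, c_mass) in product(m1.items(), m2.items()):
        inter = B & C
        if inter:
            combined[inter] = combined.get(inter, 0.0) + b_mass * c_mass
        else:
            K += b_mass * c_mass  # conflicting mass
    if K >= 1.0:
        raise ValueError("Total conflict: combination undefined")
    return {A: v / (1.0 - K) for A, v in combined.items()}

# Two sources over the frame {a, b} (illustrative masses).
a, b = frozenset("a"), frozenset("b")
ab = a | b
m1 = {a: 0.6, ab: 0.4}
m2 = {a: 0.5, b: 0.3, ab: 0.2}
m = dempster_combine(m1, m2)
```

Note that the rule only multiplies and renormalizes: it never models mutual influence between focal elements, which is precisely the gap the interference-effects combination rule above targets.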
Explicitly fusing plug-and-play guidance of source prototype into target subspace for domain adaptation
IF 14.7 | Q1 | Computer Science
Information Fusion | Pub Date: 2025-05-14 | DOI: 10.1016/j.inffus.2025.103197
Hao Luo, Zhiqiang Tian, Panpan Jiao, Meiqin Liu, Shaoyi Du, Kai Nan
Abstract: The commonly used maximum mean discrepancy (MMD) criterion has two main drawbacks when reducing cross-domain distribution gaps: first, it reduces the distribution discrepancy in a global manner, potentially ignoring local structural information between domains; second, its performance relies heavily on an often-unstable pseudo-label refinement process. To solve these problems, we introduce two universal plug-and-play modules: dynamic prototype pursuit (DPP) regularization and a bi-branch self-training (BST) mechanism. DPP introduces an inter-class perspective that stabilizes MMD by assigning a source prototype to each target sample, allowing inter-class data structure information to be used for better alignment. BST is a novel non-parametric pseudo-label refinement mechanism that updates the pseudo labels of target data using a classifier trained on the same distribution as the target domain; this avoids the distribution-gap issue, making BST more likely to generate accurate target pseudo labels. Importantly, DPP and BST are universal plug-and-play modules for shallow domain adaptation methods. To demonstrate this, experiments with three MMD-based models incorporating DPP and BST are conducted on the Office-Caltech, Reuters21578, and Berlin-Emovo-Tess datasets. The results show that models incorporating DPP and BST generally outperform their counterparts without them on multiple metrics, including accuracy, F1-score, MCC, and false positive rate. Code for the three DA methods enhanced by the plug-and-play DPP and BST is available at: https://github.com/Evelhz/DPP-and-BST.
(Information Fusion, Volume 123, Article 103197)
Citations: 0
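The MMD criterion critiqued above measures the distance between two sample sets through kernel mean embeddings: MMD² = E[k(x,x')] - 2E[k(x,y)] + E[k(y,y')]. A minimal sketch of the biased empirical estimator with an RBF kernel (generic MMD, unrelated to the paper's DPP/BST modules; `gamma` is an assumed bandwidth):

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """Gaussian RBF kernel matrix between rows of X and rows of Y."""
    sq = (X**2).sum(1)[:, None] + (Y**2).sum(1)[None, :] - 2 * X @ Y.T
    return np.exp(-gamma * sq)

def mmd2(X, Y, gamma=1.0):
    """Biased empirical estimate of squared MMD between samples X and Y:
    mean k(x,x') - 2 * mean k(x,y) + mean k(y,y')."""
    return (rbf_kernel(X, X, gamma).mean()
            - 2 * rbf_kernel(X, Y, gamma).mean()
            + rbf_kernel(Y, Y, gamma).mean())

rng = np.random.default_rng(0)
# Same distribution: MMD^2 should be near zero.
same = mmd2(rng.normal(0, 1, (200, 2)), rng.normal(0, 1, (200, 2)))
# Mean-shifted distribution: MMD^2 should be clearly larger.
shifted = mmd2(rng.normal(0, 1, (200, 2)), rng.normal(3, 1, (200, 2)))
```

Because this statistic compares only the global kernel means of the two samples, it is blind to which class each sample belongs to, which is exactly the local-structure weakness DPP is designed to compensate for.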
Deep learning for hyperspectral image classification: A comprehensive review and future predictions
IF 14.7 | Q1 | Computer Science
Information Fusion | Pub Date: 2025-05-14 | DOI: 10.1016/j.inffus.2025.103285
Yongchao Song, Junhao Zhang, Zhaowei Liu, Yang Xu, Siwen Quan, Lijun Sun, Jiping Bi, Xuan Wang
Abstract: Hyperspectral image classification (HSIC) is an important research direction in remote sensing image analysis and computer vision, with great practical significance. Hyperspectral imaging (HSI) is widely used in a variety of scenarios thanks to its rich spectral and spatial information, but high-dimensional data characteristics and the scarcity of labeled samples challenge classification accuracy. Deep learning (DL), with its powerful feature extraction and modeling capabilities, provides an effective means of solving the nonlinear problems in HSIC. In this survey, we systematically review the research progress and applications of DL in HSIC. First, we outline the importance of accurate classification and analyze the characteristics of HSI and the challenges DL faces in this area. Second, we introduce different feature representations of HSI and comprehensively describe the application of various DL models to HSIC. We also explore DL methods that can effectively improve classification performance when training samples are insufficient. Finally, we summarize the current state of research and propose future directions and suggestions.
(Information Fusion, Volume 123, Article 103285)
Citations: 0
HSE: A plug-and-play module for unified fault diagnosis foundation models
IF 14.7 | Q1 | Computer Science
Information Fusion | Pub Date: 2025-05-14 | DOI: 10.1016/j.inffus.2025.103277
Qi Li, Bojian Chen, Qitong Chen, Xuan Li, Zhaoye Qin, Fulei Chu
Abstract: Intelligent Fault Diagnosis (IFD) plays a crucial role in industrial applications, where developing foundation models analogous to ChatGPT for comprehensive fault diagnosis remains a significant challenge. Current IFD methodologies are constrained by their inability to construct unified models capable of processing heterogeneous signal types, varying sampling rates, and diverse signal lengths across different equipment. To address these limitations, we propose a novel Heterogeneous Signal Embedding (HSE) module that projects heterogeneous signals into a unified signal space, offering seamless plug-and-play integration with existing IFD architectures. The HSE framework comprises two primary components: the Temporal-Aware Patching (TAP) module, which embeds heterogeneous signals into a unified space, and the Cross-Dimensional Patch Fusion (CDPF) module, which fuses the embedded signals with temporal information into unified representations. We validate the efficacy of HSE through two comprehensive case studies: a simulated signal dataset and three distinct bearing datasets with heterogeneous features. Experimental results demonstrate that HSE significantly enhances traditional fault diagnosis models, improving both diagnostic accuracy and generalization. While conventional approaches require separate models for specific signal types, sampling frequencies, and signal lengths, HSE-enabled architectures learn unified representations across diverse signals. Results on bearing fault diagnosis confirm substantial improvements in both diagnostic precision and cross-dataset generalization. As a pioneering contribution toward IFD foundation models, the proposed HSE framework establishes a fundamental architecture for advancing unified fault diagnosis systems.
(Information Fusion, Volume 123, Article 103277)
Citations: 0
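The core difficulty HSE addresses, signals that differ in length and sampling rate, can be illustrated by resampling each signal to a common rate and splitting it into fixed-length patches. This is a toy sketch under assumed parameters (`target_fs`, `patch_len`), not the paper's TAP/CDPF design:

```python
import numpy as np

def embed_signal(x, fs, target_fs=1000, patch_len=64):
    """Resample a 1-D signal to a common rate, then split it into
    fixed-length patches (zero-padding the tail). Illustrates mapping
    heterogeneous signals into one patch space; HSE's actual TAP/CDPF
    modules are more elaborate."""
    n_target = int(round(len(x) * target_fs / fs))
    t_old = np.linspace(0.0, 1.0, len(x))
    t_new = np.linspace(0.0, 1.0, n_target)
    x_rs = np.interp(t_new, t_old, x)          # linear resampling
    n_patches = -(-len(x_rs) // patch_len)     # ceiling division
    padded = np.zeros(n_patches * patch_len)
    padded[:len(x_rs)] = x_rs
    return padded.reshape(n_patches, patch_len)

# Two signals with different lengths and sampling rates land in the
# same (n_patches, patch_len) space, so one downstream model fits both.
p1 = embed_signal(np.sin(np.linspace(0, 10, 500)), fs=500)
p2 = embed_signal(np.sin(np.linspace(0, 10, 2000)), fs=2000)
```

Once both recordings share a patch grid, a single encoder can consume either one, which is the precondition for the unified representations the abstract describes.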