Information Fusion: Latest Publications

A multimodal information-interconnected network for medication guidance in HR+/HER2- breast cancer treatment
IF 14.7 · Q1 · Computer Science
Information Fusion Pub Date: 2025-05-28 DOI: 10.1016/j.inffus.2025.103326
Jinlin Ye, Yuhan Liu, Shangjie Ren, Changjun Wang, Yidong Zhou, Liang Yang, Wei Zhang
Abstract: Predicting lymph node metastasis counts is crucial for determining appropriate drug therapy in hormone receptor-positive (HR+) and human epidermal growth factor receptor 2-negative (HER2-) breast cancer patients. Current clinical lymph node staging primarily depends on axillary surgical procedures. Early-stage breast cancer patients with negative axillary ultrasound findings may be exempt from axillary surgery, resulting in the absence of definitive lymph node staging information. Existing artificial intelligence methods typically utilize clinical multimodal data to make these predictions. However, the inherent heterogeneity among modalities limits the ability of models to fully explore and establish complex inter-modal relationships, thereby restricting their representational and predictive capabilities. Therefore, this paper proposes a multimodal information-interconnected neural network (MIINet) for lymph node metastasis number (LNM) prediction, thereby guiding drug therapy decisions for HR+/HER2- breast cancer patients. MIINet involves three key innovations: the Clinical Clustering Graph Encoder (CCGE), which effectively models complex intra-cluster relationships in clinical data; the Potential Association State (PAS), which captures implicit inter-modal correlations through hierarchical feature extraction and fusion; and Feature Categorization and Reorder (FCR), which enhances feature diversity and inter-modal interactions. Experiments were conducted on both single-center and multi-center datasets. Under 10-fold cross-validation, MIINet achieves an accuracy of 0.8220 ± 0.0602 and 0.7572 ± 0.0270, an F1-score of 0.8178 ± 0.0588 and 0.7518 ± 0.0324, an AUC of 0.8979 ± 0.0479 and 0.8326 ± 0.0350, a specificity of 0.9204 ± 0.0237 and 0.8864 ± 0.0217, an FNR of 0.2400 ± 0.1045 and 0.3157 ± 0.0929, and an MCC of 0.7370 ± 0.0914 and 0.6445 ± 0.0383 on the two datasets, respectively. Comparative experiments indicate that the proposed MIINet outperforms other reported multimodal fusion methods in the Abemaciclib drug guidance task. The source code is available at https://github.com/JinlinYY/MIINet.
Information Fusion, Volume 124, Article 103326.
Citations: 0
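The metrics above are reported as mean ± standard deviation over 10 cross-validation folds. As a purely illustrative aid (not the authors' code), a minimal sketch of that aggregation pattern with scikit-learn follows; the dataset and classifier are synthetic placeholders.

```python
# Hypothetical sketch: aggregating 10-fold cross-validation metrics as mean ± std,
# the reporting format used in the MIINet abstract. Not the authors' implementation.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score, matthews_corrcoef
from sklearn.model_selection import StratifiedKFold

X, y = make_classification(n_samples=300, n_features=32, random_state=0)  # placeholder data
clf = RandomForestClassifier(random_state=0)                              # placeholder model

scores = {"accuracy": [], "f1": [], "auc": [], "mcc": []}
for train_idx, test_idx in StratifiedKFold(n_splits=10, shuffle=True, random_state=0).split(X, y):
    clf.fit(X[train_idx], y[train_idx])
    pred = clf.predict(X[test_idx])
    prob = clf.predict_proba(X[test_idx])[:, 1]
    scores["accuracy"].append(accuracy_score(y[test_idx], pred))
    scores["f1"].append(f1_score(y[test_idx], pred))
    scores["auc"].append(roc_auc_score(y[test_idx], prob))
    scores["mcc"].append(matthews_corrcoef(y[test_idx], pred))

for name, vals in scores.items():
    print(f"{name}: {np.mean(vals):.4f} ± {np.std(vals):.4f}")
```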
A knowledge-informed dynamic correlation modeling framework for lane-level traffic flow prediction
IF 14.7 · Q1 · Computer Science
Information Fusion Pub Date: 2025-05-28 DOI: 10.1016/j.inffus.2025.103327
Ruiyuan Jiang, Shangbo Wang, Wei Ma, Yuli Zhang, Pengfei Fan, Dongyao Jia
Abstract: Lane-level traffic prediction forecasts near-future conditions at specific lane segments, enabling real-time traffic management and particularly aiding autonomous vehicles (AVs) in precise tasks such as car-following and lane changes. Despite substantial advancements in this field, some key challenges remain. First, the traffic state of a lane segment exhibits dynamic, nonlinear spatial correlation with other segments, making accurate modeling complex in real-world environments. Second, existing deep learning models depend heavily on specific datasets, leading to poor generalization. Third, while recent studies have shown that Large Language Models (LLMs) exhibit superior performance in generating reliable traffic prediction results, their direct application is hindered by inefficiency, high computational costs, and difficulties in capturing dynamic traffic features. To address these challenges, we propose the Knowledge-informed Dynamic Correlation Modeling (KIDCM) framework, which integrates pre-trained LLMs with traditional predictive methodologies to achieve a balance between generalization and prediction accuracy. Specifically, we introduce a General Spatial Dynamics Modeling (GSDM) method, which leverages unbiased traffic data generated by the LLM to analyze the general laws governing dynamic spatial correlations. By integrating traditional time-series models with attention mechanisms, GSDM effectively models both linear temporal dependencies and nonlinear spatial interactions, ensuring robust generalization across varying conditions. Additionally, we develop a surrogate model that distills the traffic prediction function of LLMs. This surrogate model can be fine-tuned with small sample sizes, preserving the generalization advantages of LLMs while mitigating their typically high resource demands. Extensive evaluations demonstrate that our framework outperforms state-of-the-art models in terms of generalization, small-sample training, and computational cost.
Information Fusion, Volume 124, Article 103327.
Citations: 0
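The KIDCM abstract describes distilling the traffic-prediction function of an LLM into a lightweight surrogate that can be fine-tuned on small samples. Below is a generic, hypothetical sketch of that kind of teacher-to-student distillation with a mean-squared-error objective; the GRU student, tensor shapes, and teacher outputs are stand-ins, not the paper's architecture.

```python
# Hypothetical sketch of surrogate distillation: a small student network is trained to
# reproduce a frozen teacher's traffic-flow predictions (here the "teacher" outputs are a
# stand-in for LLM-generated predictions). Shapes and module choices are illustrative only.
import torch
import torch.nn as nn

class Student(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        self.gru = nn.GRU(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                 # x: (batch, history_len, 1) past flow values
        _, h = self.gru(x)
        return self.head(h[-1])           # next-step flow prediction

student = Student()
opt = torch.optim.Adam(student.parameters(), lr=1e-3)

x = torch.randn(32, 12, 1)                # placeholder history windows
teacher_pred = torch.randn(32, 1)         # stand-in for LLM-produced predictions

for _ in range(100):                      # small-sample fine-tuning loop
    loss = nn.functional.mse_loss(student(x), teacher_pred)
    opt.zero_grad()
    loss.backward()
    opt.step()
```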
Mixed-noise robust tensor multi-view clustering via adaptive dictionary learning
IF 14.7 · Q1 · Computer Science
Information Fusion Pub Date: 2025-05-27 DOI: 10.1016/j.inffus.2025.103322
Jing-Hua Yang, Yi Zhou, Lefei Zhang, Heng-Chao Li
Abstract: Multi-view clustering (MVC) has received extensive attention for exploiting the consistent and complementary information among views. To improve the robustness of MVC, most MVC methods assume that the noise implicit in the data follows a predefined distribution. However, due to equipment limitations and the transmission environment, collected multi-view data often contain mixed noise. The predefined distribution assumption may not effectively suppress complex mixed noise, resulting in a decrease in clustering performance. To solve this problem, we propose a novel mixed-noise robust tensor multi-view clustering method (MRTMC) via adaptive dictionary learning. To accurately characterize the mixed noise, we model it as a combination of structural noise and Gaussian noise and characterize each component separately. Specifically, we design adaptive dictionary learning to accurately model structural noise containing semantic information and use the Frobenius norm to constrain Gaussian noise. To fully mine the consistency among multiple views, we introduce a nonconvex tensor nuclear norm on the self-representation tensor to explore the high-order correlation among multiple views. Moreover, the weight of each view is learned through an adaptive weighting strategy. To solve the model, we develop an effective algorithm based on the alternating direction method of multipliers (ADMM) framework and provide a convergence guarantee for the algorithm under mild conditions. Extensive experimental results on simulated and real-world datasets indicate that the clustering performance of the proposed MRTMC method is superior to that of the compared methods.
Information Fusion, Volume 123, Article 103322.
Citations: 0
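The MRTMC model combines a (nonconvex) tensor nuclear norm with an ADMM solver. As background only, the sketch below shows singular value thresholding, the proximal operator of the standard convex matrix nuclear norm that commonly appears as one ADMM subproblem in low-rank self-representation methods; it is not the authors' nonconvex tensor solver.

```python
# Generic sketch of singular value thresholding (SVT), the proximal operator of the
# convex matrix nuclear norm, a typical ADMM subproblem in low-rank self-representation
# models. MRTMC uses a nonconvex tensor variant; this is only the standard building block.
import numpy as np

def svt(M, tau):
    """Solve argmin_X tau*||X||_* + 0.5*||X - M||_F^2 by soft-thresholding singular values."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)
    return (U * s_shrunk) @ Vt

M = np.random.randn(50, 40)
X = svt(M, tau=1.0)
print(np.linalg.matrix_rank(X))  # rank is reduced relative to M
```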
Neurostressology: A systematic review of EEG-based automated mental stress perspectives
IF 14.7 · Q1 · Computer Science
Information Fusion Pub Date: 2025-05-27 DOI: 10.1016/j.inffus.2025.103368
Sayantan Acharya, Abbas Khosravi, Douglas Creighton, Roohallah Alizadehsani, U Rajendra Acharya
Abstract: Presently, mental stress is a significant contributor to physical and psychological health issues, making its early detection and monitoring a public health priority. Among various neuroimaging methods, electroencephalography (EEG) has emerged as a promising tool due to its ability to capture fine-grained temporal dynamics associated with cognitive stress responses. This paper presents a systematic review of 275 peer-reviewed studies published between 2003 and January 2025, focused on EEG-based mental stress quantification. While previous stress reviews primarily emphasized signal processing pipelines and psychological stress methods, this study emphasizes the potential of fusion-centric approaches. It systematically analyzes and compares studies across multiple dimensions, including EEG datasets, stressor categories, key electrodes, brain regions, feature correlations, and classifier performance, to identify methodological trends, inconsistencies, and gaps in standardization. The review underlines multiple levels of fusion, including multimodal fusion such as EEG with speech-based features, algorithmic fusion, and fusion of transfer learning and feature extraction using multimodal foundation models. It also examines key challenges in reproducibility, dataset availability, differences in brain region selection, experiment duration, and EEG processing approaches. Among classifiers, SVM, random forest, and decision tree are identified as the most effective AI methods for stress classification, with CNN and LSTM showing superior performance in capturing spatiotemporal patterns. This review concludes by highlighting the importance of fusing cortical activation patterns with EEG-based connectivity measures and deep learning techniques to enhance the accuracy of mental stress detection.
Information Fusion, Volume 124, Article 103368.
Citations: 0
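The review identifies SVM, random forest, and decision tree as the most effective classical classifiers for EEG-based stress classification. A minimal, hypothetical sketch of such a pipeline (band-power features followed by an SVM) is given below; the EEG epochs, labels, and frequency bands are synthetic placeholders rather than any dataset covered by the review.

```python
# Hypothetical sketch of the classifier family the review highlights (SVM on EEG features).
# Band-power features and labels are synthetic placeholders, not a real EEG dataset.
import numpy as np
from scipy.signal import welch
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
fs = 256                                                 # assumed sampling rate (Hz)
epochs = rng.standard_normal((200, 4, fs * 2))           # 200 epochs, 4 channels, 2 s each
labels = rng.integers(0, 2, size=200)                    # 0 = rest, 1 = stress (placeholder)

def band_power(x, lo, hi):
    """Mean Welch PSD within a frequency band, per epoch and channel."""
    f, psd = welch(x, fs=fs, axis=-1)
    band = (f >= lo) & (f < hi)
    return psd[..., band].mean(axis=-1)

# Classic alpha (8-13 Hz) and beta (13-30 Hz) band powers per channel as features.
feats = np.hstack([band_power(epochs, 8, 13), band_power(epochs, 13, 30)])

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
print(cross_val_score(clf, feats, labels, cv=5).mean())
```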
A novel graph model for resolving power-asymmetric conflicts: Application in hierarchical diagnosis and treatment systems
IF 14.7 · Q1 · Computer Science
Information Fusion Pub Date: 2025-05-27 DOI: 10.1016/j.inffus.2025.103310
Guolin Tang, Tangzhu Zhang, Francisco Chiclana, Peide Liu
Abstract: As Chinese society undergoes rapid aging and urbanization, the existing medical service system faces significant challenges, including unequal resource distribution, a shortage of high-quality resources, and inefficient allocation. To address these issues, the hierarchical diagnosis and treatment system (HDTS) has been introduced to optimize medical resource allocation and utilization. However, implementing HDTS encounters complex conflicts of interest among multiple decision-makers (DMs), compounded by ambiguity, uncertainty, and power asymmetry. This paper proposes the power-asymmetric additive graph model for conflict resolution (PAAGMCR), a versatile tool that integrates qualitative and quantitative methods to address stakeholder conflict in HDTS implementation in Shandong, China. The optimal solution s18 can be identified using PAAGMCR: the Shandong Provincial Government should standardize medical treatment processes in tertiary hospitals, invest in grassroots medical facilities, allocate funds for public awareness campaigns, and encourage patients to seek initial treatment at the grassroots level. Tertiary hospitals should collaborate with grassroots hospitals to utilize subsidies for equipment upgrades and workforce training. Patients and their families should adhere to HDTS principles and make informed healthcare decisions. Furthermore, this study outlines an evolutionary path from the initial to the optimal state, offering theoretical support for resolving real-world conflicts. Finally, strategic recommendations are provided based on the analysis of the conflict to guide DMs in implementing HDTS effectively.
Information Fusion, Volume 123, Article 103310.
Citations: 0
SNAFusion-MM: Distilling sparse sampled measurements by 2D axial diffusion priors with multi-step matching for 3D inverse problem
IF 14.7 · Q1 · Computer Science
Information Fusion Pub Date: 2025-05-26 DOI: 10.1016/j.inffus.2025.103323
Xiaoyue Li, Tielong Cai, Jun Dan, Sizhao Ma, Kai Shang, Mark D. Butala, Gaoang Wang
Abstract: Reconstructing 3D volumes with inner details from sparse measurements remains a critical challenge in Computed Tomography (CT) and Magnetic Resonance Imaging (MRI). Existing data-driven 3D decoders suffer from limited generalizability, and most recent diffusion models remain restricted to 2D domains due to prohibitive memory demands that hinder 3D utilization. Although implicit neural rendering (INR) methods develop 3D representations, they frequently struggle to maintain reconstruction fidelity under extremely sparse-view conditions. We propose SNAFusion-MM, a framework that unifies 2D axial diffusion priors, geometric constraints from physical operators, and a multi-step distillation strategy for sparsely measured 3D medical reconstruction. Unlike conventional score distillation sampling, our framework distills robust prior knowledge from a pre-trained 2D prior within deterministic DDIM trajectories and incorporates plug-and-play geometric information related to the measurement process to refine a globally coherent 3D neural radiance field, eliminating the over-smoothing artifacts of single-step SDS while preserving 3D consistency and anatomical details. We conducted experiments on challenging in-/out-of-distribution datasets on a single GPU without any retraining. Quantitative and qualitative assessments demonstrate that SNAFusion-MM outperforms recent works and exhibits superior generalizability, especially on extremely sparse-view cone-beam CT (CBCT), X-ray novel-view synthesis (NVS) from sparsely sampled CBCT, and radially sampled compressed sensing MRI (CS-MRI) tasks, which cannot yet be well handled by state-of-the-art (SOTA) methods.
Information Fusion, Volume 124, Article 103323.
Citations: 0
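The SNAFusion-MM abstract refers to distilling a 2D prior within deterministic DDIM trajectories. For orientation, the sketch below implements one generic deterministic DDIM reverse step (eta = 0); the noise-prediction output and noise schedule are placeholders, and this is not the paper's distillation pipeline.

```python
# Generic sketch of one deterministic DDIM reverse step (eta = 0), the kind of
# trajectory the abstract refers to. The noise prediction is a placeholder for a
# trained 2D axial prior; this is not the SNAFusion-MM pipeline itself.
import torch

def ddim_step(x_t, eps_pred, alpha_bar_t, alpha_bar_prev):
    """Move x_t one step along the deterministic DDIM trajectory."""
    # Predict the clean sample x_0 from the current noisy sample and predicted noise.
    x0_pred = (x_t - torch.sqrt(1 - alpha_bar_t) * eps_pred) / torch.sqrt(alpha_bar_t)
    # Re-noise x_0 to the previous timestep without any stochastic term (eta = 0).
    return torch.sqrt(alpha_bar_prev) * x0_pred + torch.sqrt(1 - alpha_bar_prev) * eps_pred

x_t = torch.randn(1, 1, 64, 64)                 # placeholder axial slice at timestep t
eps_pred = torch.randn_like(x_t)                # stand-in for a trained prior's noise output
alpha_bars = torch.linspace(0.99, 0.01, 50)     # placeholder cumulative noise schedule
x_prev = ddim_step(x_t, eps_pred, alpha_bars[-1], alpha_bars[-2])
print(x_prev.shape)
```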
Low-rank Tucker decomposition for multi-view outlier detection based on meta-learning
IF 14.7 · Q1 · Computer Science
Information Fusion Pub Date: 2025-05-23 DOI: 10.1016/j.inffus.2025.103313
Wei Lin, Kun Xie, Jiayin Li, Shiping Wang, Li Xu
Abstract: The analysis and mining of multi-view data have gained widespread attention, making multi-view anomaly detection a prominent research area. Despite notable advancements in the performance of existing multi-view anomaly detection methods, they still face certain limitations. (1) The existing methods fail to fully leverage the low-rank structure of multi-view data, which results in a lack of necessary interpretability when uncovering the latent relationships between views. (2) In recovering the consensus structure, current methods rely merely on a simple aggregation process, lacking in-depth exploration of and interaction between the potential structures of each view. To address these challenges, we propose the Low-Rank Tucker Decomposition based on Meta-Learning (LRTDM) for multi-view outlier detection. First, the low-rank Tucker decomposition is employed to reveal the low-rank structure of the multi-view self-expressive tensor. The factor matrices and core tensor effectively preserve and encode the latent structure of each view. This structured representation can efficiently capture the potential shared features between views, allowing for a more refined analysis of each individual view. Furthermore, meta-learning is utilized to define the learning and fusion of view-specific latent features as a nested optimization problem, which is solved alternately using a two-layer optimization scheme. Finally, anomalies are detected through the consensus matrix recovered from the latent representations and the error matrix from the self-expressive tensor learning process. Extensive experiments conducted on five publicly available datasets demonstrate the effectiveness of our approach. The results show that our algorithm improves detection accuracy by 2% to 10% compared to state-of-the-art methods.
Information Fusion, Volume 123, Article 103313.
Citations: 0
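LRTDM builds on the low-rank Tucker decomposition of a multi-view self-expressive tensor into a core tensor and per-mode factor matrices. A minimal sketch of that standard decomposition using the TensorLy library follows; the tensor shape and multilinear ranks are arbitrary, and the meta-learning and anomaly-scoring components of the paper are not shown.

```python
# Minimal sketch of a low-rank Tucker decomposition using TensorLy, the basic operation
# the abstract builds on (core tensor + one factor matrix per mode). Shapes, ranks and
# data are arbitrary placeholders; this is not the LRTDM model itself.
import numpy as np
import tensorly as tl
from tensorly.decomposition import tucker

X = tl.tensor(np.random.rand(100, 100, 5))         # e.g. samples x samples x views
core, factors = tucker(X, rank=[10, 10, 3])         # low multilinear rank
X_hat = tl.tucker_to_tensor((core, factors))        # reconstruction from the Tucker factors

print(core.shape, [f.shape for f in factors])
print(float(tl.norm(X - X_hat) / tl.norm(X)))       # relative reconstruction error
```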
CertainTTA: Estimating uncertainty for test-time adaptation on medical image segmentation
IF 14.7 · Q1 · Computer Science
Information Fusion Pub Date: 2025-05-22 DOI: 10.1016/j.inffus.2025.103300
Xingbo Dong, Liwen Wang, Xingguo Lv, Xiaoyan Zhang, Hui Zhang, Bin Pu, Zhan Gao, Iman Yi Liao, Zhe Jin
Abstract: Cross-site distribution shift in medical images is a major factor causing model performance degradation, significantly challenging the deployment of pre-trained semantic segmentation models for clinical adoption. In this paper, we propose a novel framework, CertainTTA, to maximally exploit a pretrained model for test-time adaptation. First, we leverage variational inference and innovatively construct a probabilistic source model by incorporating Gaussian priors on the network parameters of the pre-trained source model. A predictive posterior distribution is computed at test time for the target image, which is then used to estimate the uncertainty of the target prediction based on an entropy measure. In the meantime, a novel adaptive score is constructed to measure the source model's uncertainty about its adaptability to a target image, based on the mutual information between the target prediction and the target input. Both output uncertainty and model uncertainty are incorporated at test time: the former is minimized against a low-frequency prompt that optimally reduces the domain shift at the image level, and the latter is used to select the target prediction with the best model adaptability during the prompt optimization process. CertainTTA overcomes the weakness of existing entropy minimization methods, which become unreliable under biased target scenarios and tend to yield overconfident predictions. To the best of our knowledge, CertainTTA also serves as the first solution to trace model adaptability in a CTTA setting. We conduct TTA and CTTA experiments on three medical semantic segmentation benchmarks, achieving average DSC improvements of 2.94%, 4.06%, and 3.49% under the TTA scenario over the state-of-the-art method on the OD/OC, polyp, and MRI prostate segmentation datasets, respectively.
Information Fusion, Volume 123, Article 103300.
Citations: 0
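CertainTTA estimates output uncertainty from the entropy of a predictive posterior and model adaptability from a mutual-information score. The generic sketch below computes predictive entropy and its BALD-style mutual-information decomposition from Monte Carlo posterior samples; the sampled probabilities are random placeholders, and the paper's variational posterior and prompt optimization are not reproduced.

```python
# Generic sketch of entropy / mutual-information uncertainty from Monte Carlo posterior
# samples (a BALD-style decomposition). The per-pixel probability samples are random
# placeholders; CertainTTA's variational posterior and prompt optimization are not shown.
import torch

def uncertainty(probs, eps=1e-8):
    """probs: (S, C, H, W) softmax outputs from S posterior samples."""
    mean_p = probs.mean(dim=0)                                        # predictive distribution
    pred_entropy = -(mean_p * (mean_p + eps).log()).sum(dim=0)        # total uncertainty (H, W)
    exp_entropy = -(probs * (probs + eps).log()).sum(dim=1).mean(0)   # expected entropy (H, W)
    mutual_info = pred_entropy - exp_entropy                          # epistemic (model) part
    return pred_entropy, mutual_info

samples = torch.softmax(torch.randn(8, 2, 64, 64), dim=1)  # 8 samples, 2 classes (placeholder)
h, mi = uncertainty(samples)
print(h.mean().item(), mi.mean().item())
```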
Cross-source transformer-based neighborhood contrastive learning for joint classification of hyperspectral and LiDAR Data
IF 14.7 · Q1 · Computer Science
Information Fusion Pub Date: 2025-05-22 DOI: 10.1016/j.inffus.2025.103225
Shuai Liu, Tong Yu, Jun Zhou, Guanglong Xing, Huanfa Chen
Abstract: The fusion of hyperspectral image (HSI) and light detection and ranging (LiDAR) data has demonstrated significant potential in the land cover classification task. Although deep learning has shown remarkable success in the joint classification of HSI and LiDAR data, the large amount of unlabeled multisource remote sensing data is not fully utilized. Additionally, effectively integrating HSI and LiDAR data remains a challenging task, and the semantic relationships of neighborhood regions need to be further exploited. In this paper, we propose a cross-source transformer-based neighborhood contrastive learning model (CTNCLM), which acquires a more discriminative feature representation from unlabeled data through a pre-training stage. By considering the semantic correlation between neighboring image patches, CTNCLM achieves the joint classification of HSI and LiDAR data at a more precise level. A cross-patch contrastive learning (CPCL) module is proposed to calculate the similarity between original patches and neighborhood patches. Furthermore, a cross-source transformer (CST) with cross-source attention is proposed to fuse the multi-source data, which exploits the intermodal information interaction between the HSI and LiDAR data. Extensive experiments on three public datasets demonstrate the superior classification performance of the proposed method compared with several state-of-the-art methods.
Information Fusion, Volume 124, Article 103225.
Citations: 0
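The CPCL module contrasts original patches with their neighborhood patches during pre-training. Below is a generic InfoNCE-style loss between two sets of patch embeddings, shown only to illustrate the family of objectives involved; the embeddings, batch construction, and temperature are assumptions rather than the authors' design.

```python
# Generic InfoNCE-style contrastive loss between embeddings of original patches and their
# neighborhood patches, the kind of objective cross-patch contrastive learning is built on.
# Embeddings and batches are placeholders; this is not the authors' implementation.
import torch
import torch.nn.functional as F

def info_nce(z_orig, z_neigh, temperature=0.1):
    """z_orig, z_neigh: (N, D) embeddings; the i-th rows form a positive pair."""
    z_orig = F.normalize(z_orig, dim=1)
    z_neigh = F.normalize(z_neigh, dim=1)
    logits = z_orig @ z_neigh.t() / temperature     # (N, N) cosine-similarity logits
    targets = torch.arange(z_orig.size(0))          # i-th original matches i-th neighbor
    return F.cross_entropy(logits, targets)

z_a = torch.randn(64, 128)   # embeddings of original HSI/LiDAR patches (placeholder)
z_b = torch.randn(64, 128)   # embeddings of neighborhood patches (placeholder)
print(info_nce(z_a, z_b).item())
```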
Context-driven and sparse decoding for Remote Sensing Visual Grounding
IF 14.7 · Q1 · Computer Science
Information Fusion Pub Date: 2025-05-22 DOI: 10.1016/j.inffus.2025.103296
Yichen Zhao, Yaxiong Chen, Ruilin Yao, Shengwu Xiong, Xiaoqiang Lu
Abstract: Remote Sensing Visual Grounding (RSVG) is an emerging multimodal remote sensing task that involves grounding textual descriptions to specific objects in remote sensing images. Previous methods often overlook the impact of complex backgrounds and similar geographic entities during feature extraction, which may confuse target features and cause performance bottlenecks. Moreover, remote sensing scenes include extensive surface information, much of which contributes little to reasoning about the target object. This redundancy not only increases the computational burden but also impairs decoding efficiency. To this end, we propose the Context-driven Sparse Decoding Network (CSDNet) for accurate grounding through multimodal context-aware feature extraction and text-guided sparse reasoning. To alleviate target feature confusion, a Text-aware Fusion Module (TFM) is introduced to refine the visual features using textual cues related to the image context. In addition, a Context-enhanced Interaction Module (CIM) is proposed to harmonize the differences between remote sensing images and text by modeling multimodal contexts. To tackle surface information redundancy, a Text-guided Sparse Decoder (TSD) is developed, which decouples image resolution from reasoning complexity by performing sparse sampling under text guidance. Extensive experiments on the DIOR-RSVG, OPT-RSVG, and VRSBench benchmarks demonstrate the effectiveness of CSDNet. Remarkably, CSDNet utilizes only 5.12% of the visual features when performing cross-modal reasoning about the target object. The code is available at https://github.com/WUTCM-Lab/CSDNet.
Information Fusion, Volume 123, Article 103296.
Citations: 0
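CSDNet reports that its Text-guided Sparse Decoder uses only 5.12% of the visual features for cross-modal reasoning. As a hypothetical illustration of text-guided sparse token selection, the sketch below scores visual tokens against a pooled text embedding and keeps the top-k; the dot-product scoring rule and keep ratio are assumptions, not the paper's TSD design.

```python
# Hypothetical sketch of text-guided sparse token selection: score each visual token
# against a pooled text embedding and keep only the top-k tokens for decoding.
# The scoring rule and keep ratio are assumptions, not the paper's TSD design.
import torch

def select_tokens(visual_tokens, text_embed, keep_ratio=0.05):
    """visual_tokens: (N, D); text_embed: (D,). Return the most text-relevant tokens."""
    scores = visual_tokens @ text_embed                  # relevance of each token to the text
    k = max(1, int(keep_ratio * visual_tokens.size(0)))
    top_idx = scores.topk(k).indices
    return visual_tokens[top_idx], top_idx

tokens = torch.randn(1024, 256)     # placeholder visual tokens from an image encoder
text = torch.randn(256)             # placeholder pooled text embedding
kept, idx = select_tokens(tokens, text)
print(kept.shape)                   # roughly 5% of the tokens survive
```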