Information Fusion | Volume 127, Article 103776 | Pub Date: 2025-09-29 | DOI: 10.1016/j.inffus.2025.103776
Wen Wen, Tieliang Gong, Yuxin Dong, Shujian Yu, Bo Dong
Title: Towards the generalization of multi-view learning: An information-theoretical analysis
Abstract: Multi-view learning has drawn widespread attention for its efficacy in leveraging cross-view consensus and complementary information to achieve a comprehensive representation of data. While multi-view learning has undergone vigorous development and achieved remarkable success, the theoretical understanding of its generalization behavior remains elusive. This paper aims to bridge this gap by developing information-theoretic generalization bounds for multi-view learning, with a particular focus on multi-view reconstruction and classification tasks. Our bounds underscore the importance of capturing both consensus and complementary information across views to achieve maximally disentangled representations. These results also indicate that applying the multi-view information bottleneck regularizer is beneficial for satisfactory generalization performance. Additionally, we derive novel data-dependent bounds under both leave-one-out and supersample settings, yielding computationally tractable and tighter bounds. In the interpolating regime, we further establish a fast-rate bound for multi-view learning, exhibiting a faster convergence rate than conventional square-root bounds. Numerical results indicate a strong correlation between the true generalization gap and the derived bounds.
Information Fusion | Volume 127, Article 103791 | Pub Date: 2025-09-29 | DOI: 10.1016/j.inffus.2025.103791
Xiaohua Wu, Xiaohui Tao, Wenjie Wu, Jianwei Zhang, Yuefeng Li, Lin Li
Title: Random forest of thoughts: Reasoning path fusion for LLM inference in computational social science
Abstract: Large language models (LLMs) have demonstrated significant promise for reasoning problems. They are among the leading techniques for context inference, particularly in scenarios with strong sequential dependencies, where earlier inputs dynamically influence subsequent responses. However, existing reasoning paradigms such as X-of-thoughts (XoT) typically rely on unidirectional, left-to-right inference with limited inference paths. This renders them ineffective at handling inherent skip logic and multi-path reasoning, especially in contexts such as a multi-turn social survey. To address this, we propose Random Forest of Thoughts (RFoT), a novel prompting framework grounded in the principles of reasoning path fusion for skip logic. It uses Iterative Chain-of-Thought (ICoT) prompting to generate a diverse set of reasoning thoughts, which are then assessed by a cooperative contribution evaluator. By randomly sampling and fusing the top-k reasoning thoughts, RFoT simulates uncertain skip logic and constructs a rich forest of plausible thoughts. This enables robust multi-path reasoning, where each question sequence formed by the skip logic is treated as an independent reasoning path. RFoT is validated on two classic social problems featuring strong skip logic, using three open-source LLMs and five datasets spanning structured social surveys and public social media data. Experimental results demonstrate that RFoT significantly enhances inference performance on problems that require complex, non-linear reasoning across both survey and social media data. The transparency and trustworthiness of the results stem from the interpretable fusion of diverse reasoning paths and the principled integration of cooperative evaluation mechanisms.
Information Fusion | Volume 127, Article 103782 | Pub Date: 2025-09-28 | DOI: 10.1016/j.inffus.2025.103782
Bocheng Zhao, Wenxing Zhang, Lei Bao, Wucheng Wang, Zhenyu Kong, Qiguang Miao
Title: CMF: Prediction refinement via complementary manifold-based multi-model fusion
Abstract: In current research on multi-model fusion, mainstream approaches predominantly focus on the design of fusion algorithms, while often overlooking the filtering or selection of outputs from individual base models prior to fusion. Moreover, most existing fusion methods exhibit a high degree of coupling, which limits their flexibility and adaptability in cross-scene applications. Consequently, once the fusion is completed, the model architecture tends to become fixed, making it difficult to integrate new models or replace outdated components. To address these limitations and achieve effective state-of-the-art (SOTA) breakthroughs in diverse single-label image classification tasks, such as fine-grained recognition or long-tailed distributions, without being constrained by model architecture, this paper proposes a highly generalizable multi-model complementary method. The proposed approach is applicable to single-label multi-class classification tasks in any deep learning domain and has achieved global SOTA performance on multiple image classification benchmarks. It imposes no restrictions on the architecture, parameter settings, or training strategies of the base models, enabling direct integration of existing SOTA models. Furthermore, the fusion process is fully decoupled, ensuring that the independent training of each base model remains unaffected and preserving the inherent advantages of their original training paradigms.
Information Fusion | Volume 127, Article 103775 | Pub Date: 2025-09-27 | DOI: 10.1016/j.inffus.2025.103775
Dan Liu, Zhouli Shen, Ai Peng, Zhiyuan Ma, Jinpeng Mi, Mao Ye, Jianwei Zhang
Title: JSS-CLIP: Boosting image-to-video transfer learning with JigSaw side network
Abstract: Large pre-trained vision-language models, such as CLIP, have achieved remarkable success in computer vision. However, extending image-based models to video understanding through effective temporal modeling remains an open problem. Although recent studies have shifted their focus towards image-to-video transfer learning, the majority of existing methods overlook algorithmic efficiency when adapting large models to the video domain. In this paper, we propose an innovative JigSaw Side network, JSS-CLIP, aiming to balance algorithmic efficiency and spatiotemporal modeling performance for video action recognition. Specifically, we introduce lightweight side networks attached to the frozen vision model, which avoids backpropagation through the computationally intensive pre-trained model and thereby significantly reduces computational costs. Additionally, we design an implicit alignment module to guide the generation of hierarchical spatiotemporal JigSaw feature maps. These feature maps encapsulate rich motion information and action cues within videos, facilitating a comprehensive understanding of dynamic content. We conduct extensive experiments on three large-scale action datasets, whose results consistently demonstrate the competitiveness of JSS-CLIP in terms of efficiency and performance. The source code will be released at https://github.com/liarshen/JSS-CLIP.
Information Fusion | Volume 127, Article 103777 | Pub Date: 2025-09-27 | DOI: 10.1016/j.inffus.2025.103777
Marco Mari, Lauro Snidaro
Title: Ensemble of KalmanNets with innovation-based attention for robust target tracking
Abstract: Model-based tracking algorithms often suffer from significant performance degradation when tracking maneuvering targets, primarily due to inherent uncertainties in target dynamics. To address this limitation, we propose a novel ensemble-based approach that integrates multiple neural-aided Kalman filters, referred to as KalmanNet, within a multiple-model framework inspired by traditional interacting multiple-model (IMM) filtering techniques. Each KalmanNet instance is specialized in tracking targets governed by a distinct motion model. The ensemble fuses their state estimates using a Recurrent Neural Network (RNN), which learns to adaptively weigh and combine the predictions based on the underlying target dynamics. This fusion mechanism models complex motion patterns more effectively and achieves lower estimation bias and variance than relying on a single KalmanNet when tracking maneuvering targets, as demonstrated through extensive simulation experiments. Furthermore, inspired by traditional model-based tracking algorithms, we introduce an explainable, innovation-based attention mechanism that enhances the interpretability of our results and aids the identification of target motion dynamics. Our findings indicate that this attention mechanism improves robustness to sensor noise, out-of-distribution data, and missing measurements. Overall, this innovative approach has the potential to advance state-of-the-art target tracking applications.
Information Fusion | Volume 127, Article 103784 | Pub Date: 2025-09-27 | DOI: 10.1016/j.inffus.2025.103784
Wooyoung Kim, Wooju Kim
Title: Addressing information bottlenecks in graph augmented large language models via graph neural summarization
Abstract: This study investigates the problem of information bottlenecks in graph-level prompting, where compressing all node embeddings into a single vector leads to significant structural information loss. We clarify and systematically analyze this challenge, and propose the Graph Neural Summarizer (GNS), a continuous prompting framework that generates multiple query-aware prompt vectors to better preserve graph structure and improve context relevance. Experiments on ExplaGraphs, SceneGraphs, and WebQSP show that GNS consistently improves performance over strong graph-level prompting baselines. These findings emphasize the importance of addressing information bottlenecks when integrating graph-structured data with large language models. Implementation details and source code are publicly available at https://github.com/timothy-coshin/GraphNeuralSummarizer.
Information Fusion | Volume 127, Article 103770 | Pub Date: 2025-09-26 | DOI: 10.1016/j.inffus.2025.103770
Taiqin Chen, Hao Sha, Yifeng Wang, Yuan Jiang, Shuai Liu, Zikun Zhou, Ke Chen, Yongbing Zhang
Title: A channel-adaptive and plug-and-play framework for hyperspectral image analysis
Abstract: A HyperSpectral Image (HSI) reflects rich properties of matter and facilitates distinguishing various objects, demonstrating substantial potential in a wide range of applications, including medical diagnosis and remote sensing. However, HSIs exhibit a variable number of channels due to differences in acquisition equipment, which prevents existing HSI analytical methods from utilizing data from multiple devices. To address this challenge, we first distill HSIs with varying channels into principal and residual components. We then develop a Fusion-Guided Network (FGNet) to transform the two distilled components into fused images with a fixed number of channels and perform channel-adaptive HSI analysis. To enable the fused images to retain the intensity, structure, and texture information of the original HSI, we generate pseudo labels to supervise the fusion. To help FGNet extract more representative features, we further design a low-rank attention module (LGAM), leveraging the low-rank prior of HSI that a small amount of key information can represent a large amount of data. Moreover, the proposed framework can be applied as a plug-in to existing HSI analysis methods. We conducted extensive experiments on five HSI datasets covering a medical HSI segmentation task and a remote sensing HSI classification task, demonstrating that the proposed method outperforms state-of-the-art methods. We further verified experimentally that existing works can be seamlessly incorporated into our framework to achieve channel adaptivity and boost analytical performance. Code is available at https://github.com/hnsytq/FGNet.
Information Fusion | Volume 127, Article 103781 | Pub Date: 2025-09-26 | DOI: 10.1016/j.inffus.2025.103781
Illia Fedorin
Title: Virtual PPG reconstruction from accelerometer data via adaptive denoising and cross-modal fusion
Abstract: Accurate heart rate (HR) monitoring during high-intensity activity is essential for performance optimization and physiological tracking in wearable devices. While photoplethysmography (PPG) remains the standard for HR estimation, it is prone to motion artifacts and temporary signal loss and is constrained by power consumption. Accelerometers (ACC), by contrast, offer motion-resilient and energy-efficient sensing, but estimating HR from ACC alone remains a challenging task. In this study, we introduce a cross-modal virtual sensing framework for HR estimation and spectral reconstruction using only ACC signals. The framework includes: (1) a high-fidelity variational autoencoder (VAE) for offline PPG spectrum reconstruction from ACC input, and (2) a lightweight real-time attention-based denoising model for HR prediction. Both models are trained with a fusion-aware loss to enforce alignment between motion-driven and cardiovascular signal features. Experimental results on public and proprietary datasets demonstrate strong performance and generalization under varying sensor configurations and motion conditions. The real-time model achieves a mean absolute error (MAE) of 7.0 BPM with only 2.6K parameters, making it suitable for embedded deployment. While PPG remains superior under ideal conditions, the proposed system serves as a fallback modality when optical sensing is unreliable or unavailable, enabling gap-filling, post-processing correction, and low-power monitoring. More broadly, this work positions virtual PPG reconstruction as a proof of concept for physiological virtual sensing: a paradigm where one modality can be inferred from another, and potentially reversed, supporting robust multimodal inference in real-world mobile health scenarios.
Information Fusion | Volume 127, Article 103780 | Pub Date: 2025-09-26 | DOI: 10.1016/j.inffus.2025.103780
Guodong Fan, Shuteng Hu, Jingchun Zhou, Min Gan, C. L. Philip Chen
Title: Color and texture count alike: An underwater image enhancement method via dual-attention fusion
Abstract: Underwater image enhancement is a highly challenging task, requiring solutions to complex environmental degradation factors such as light attenuation and color cast. Achieving stability in color restoration and precision in texture recovery is key to improving enhancement results. However, existing methods generally lack in-depth modeling of color and texture information and fail to efficiently fuse these two core visual components, significantly limiting the overall performance of the enhancement results. To this end, we propose an innovative Dual-Attention Fusion Net (DuAF) that addresses this problem. On a global scale, DuAF introduces explicit semantic consistency constraints to precisely model color features by reconstructing the pixel intensity distribution, enhancing sensitivity to color features, and capturing real pixel gradient changes, effectively addressing complex color distortion. On a local scale, DuAF dynamically adjusts the perception window and combines optimized attention weights with positional deviations to deeply model texture information, significantly improving the restoration of texture details. Overall, DuAF significantly improves the stability of color restoration and the clarity of texture details in complex degraded scenes, providing an efficient and comprehensive solution for underwater image enhancement. Our project is publicly available at https://github.com/HuShuteng/DuAF.
Information Fusion | Volume 127, Article 103709 | Pub Date: 2025-09-26 | DOI: 10.1016/j.inffus.2025.103709
Andrea Moglia, Matteo Leccardi, Matteo Cavicchioli, Alice Maccarini, Marco Marcon, Luca Mainardi, Pietro Cerveri
Title: Generalist models in medical image segmentation: A survey and performance comparison with task-specific approaches
Abstract: Following the successful paradigm shift of large language models, which leverages pre-training on a massive corpus of data and fine-tuning on various downstream tasks, generalist models have made their foray into computer vision. The introduction of the Segment Anything Model (SAM) marked a milestone in the segmentation of natural images, inspiring the design of numerous architectures for medical image segmentation. In this survey, we offer a comprehensive and in-depth investigation of generalist models for medical image segmentation. We begin with an introduction to the fundamental concepts that underpin their development. We then provide a feature-fusion-based taxonomy covering the different declinations of SAM (zero-shot, few-shot, fine-tuning, adapters), SAM2, other innovative models trained on images alone, and models trained on both text and images. We thoroughly analyze their performance at the level of both primary research and best-in-literature results, followed by a rigorous comparison with state-of-the-art task-specific models. We emphasize the need to address challenges concerning compliance with regulatory frameworks, privacy and security laws, budget, and trustworthy artificial intelligence (AI). Finally, we share our perspective on future directions concerning synthetic data, early fusion, lessons learnt from generalist models in natural language processing, agentic AI, physical AI, and clinical translation. We publicly release a database-backed interactive app with all survey data (https://hal9000-lab.github.io/GMMIS-Survey/).