Information Fusion, Pub Date: 2025-04-22, DOI: 10.1016/j.inffus.2025.103163
Yan Zhong , Xinping Zhao , Guangzhi Zhao , Bohua Chen , Fei Hao , Ruoyu Zhao , Jiaqi He , Lei Shi , Li Zhang
{"title":"CTD-inpainting: Towards the Coherence of Text-driven Inpainting with Blended Diffusion","authors":"Yan Zhong , Xinping Zhao , Guangzhi Zhao , Bohua Chen , Fei Hao , Ruoyu Zhao , Jiaqi He , Lei Shi , Li Zhang","doi":"10.1016/j.inffus.2025.103163","DOIUrl":"10.1016/j.inffus.2025.103163","url":null,"abstract":"<div><div>Text-driven inpainting has recently emerged as a prominent and challenging research topic in image completion, where denoising diffusion probabilistic models (DDPM)-based approaches have achieved state-of-the-art performance on authentic and diverse images. However, ensuring high image fidelity during generation remains a critical aspect of effective text-driven inpainting. Moreover, guaranteeing coherence between the unmasked region (background) and the generated results in the masked regions poses a significant challenge in measurement and implementation. To address these issues, we propose CTD-Inpainting, a novel text-driven inpainting framework that incorporates a coherence constraint between the masked and unmasked regions. Specifically, CTD-Inpainting employs a pre-trained contrastive language-image model (CLIP) to guide DDPM-based generation, aligning it with the text prompt. Additionally, we introduce a transition region between the background and the masked region via mask expansion. This transition region helps maintain coherence between the foreground and background by ensuring consistency between the generated results and the original background during inpainting. At each denoising step, we employ a blending technique, where multiple noise-injected versions of the input image are harmonized with the latent diffusion guided by the text and the coherence constraint in the transition region. This enables seamless integration of conditional information with the generated information via resampling. 
Additionally, we design an innovative coherence metric based on the coherence constraint, providing a quantitative measure for the subjective coherence assessment. Extensive experiments demonstrate the superiority of CTD-Inpainting over state-of-the-art methods on real-world and diverse images.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"122 ","pages":"Article 103163"},"PeriodicalIF":14.7,"publicationDate":"2025-04-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143863450","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MelNet: an end-to-end adaptive network with adjustable frequency for preprocessing-free broadband acoustic emission signals","authors":"Jing Huang , Rui Qin , Zhifen Zhang , Zhengyao Du , Shuai Zhang , Yu Su , Guangrui Wen , Wei Cheng , Xuefeng Chen","doi":"10.1016/j.inffus.2025.103229","DOIUrl":"10.1016/j.inffus.2025.103229","url":null,"abstract":"<div><div>Deep learning can obtain discriminative abstract information from the artificial features of acoustic emission signals and complete the target task in the form of high-precision recognition. However, mainstream deep learning serves only as a mechanical implementation and cannot provide valuable feedback on signal analysis. To overcome this problem, this paper proposes an end-to-end interpretable model that tightly integrates two-dimensional time-frequency representation with neural network feature extraction. The proposed model can be considered as an initial tool for signal analysis. Specifically, the core frequency used to control the resolution of time-frequency features is fully involved in model training and gradient propagation. The adaptive learning algorithm not only accurately captures the key frequency characteristics of the signal, but also identifies and skips sub-optimal frequency points that do not bring significant performance improvements. More importantly, this visualization of the adaptive frequency selection process ensures that the extracted features are highly relevant to the task, thus improving the interpretability of the feature extraction stage. The feasibility of the proposed method was verified in two different cases: key equipment service monitoring and manufacturing process monitoring. The results show that the proposed method can find the optimal frequency component and obtain ideal monitoring accuracy without relying on any expert experience. 
While the deep learning model itself may not be inherently interpretable, its role in guiding the feature extraction process via gradient optimization introduces a level of interpretability absent in conventional methods.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"122 ","pages":"Article 103229"},"PeriodicalIF":14.7,"publicationDate":"2025-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143869367","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Information Fusion, Pub Date: 2025-04-21, DOI: 10.1016/j.inffus.2025.103212
Quan Liu , Tianhao Wang , Qiwen Jin , Jiwei Hu , Li Li , Kin-Man Lam
{"title":"Bi-domain fusion pyramid network for pansharpening with deep anisotropic diffusion","authors":"Quan Liu , Tianhao Wang , Qiwen Jin , Jiwei Hu , Li Li , Kin-Man Lam","doi":"10.1016/j.inffus.2025.103212","DOIUrl":"10.1016/j.inffus.2025.103212","url":null,"abstract":"<div><div>Pansharpening is the process of fusing a low-resolution multi-spectral image with a high-resolution panchromatic (PAN) image to generate a high-resolution multi-spectral (HRMS) image. In this paper, a new bi-domain fusion pyramid network with deep anisotropic diffusion, termed BFP-Net, is proposed to generate a high-quality HRMS image with accurate spectral distribution as well as reasonable spatial structure. Different from previous deep models that solely rely on the supervision of the HRMS reference image, the bi-directional information flow mechanism of our network effectively enlarges the receptive field and addresses resolution differences between input images. Bi-domain fusion integrates spatial- and frequency-domain information, encouraging the model to learn complementary representations. Furthermore, the introduction of deep anisotropic diffusion adaptively preserves and enhances edge details, thereby enhancing the visual quality and structural consistency of the target image. The loss functions ensure the network’s training direction, which enhances its generalization across different datasets and produces more robust and accurate results. Extensive quantitative and qualitative experiments on real datasets demonstrate the superiority of our method over existing methods in terms of performance, showing excellent sharpening quality and spectral consistency. 
The source code is available at https://github.com/qiwenjjin/BFP-Net.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"122 ","pages":"Article 103212"},"PeriodicalIF":14.7,"publicationDate":"2025-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143869363","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Information Fusion, Pub Date: 2025-04-21, DOI: 10.1016/j.inffus.2025.103215
Genan Dai , Wenfeng Yi , Jinzhou Cao , Zhaoya Gong , Xianghua Fu , Bowen Zhang
{"title":"CRRL: Contrastive Region Relevance Learning Framework for Cross-city Traffic Prediction","authors":"Genan Dai , Wenfeng Yi , Jinzhou Cao , Zhaoya Gong , Xianghua Fu , Bowen Zhang","doi":"10.1016/j.inffus.2025.103215","DOIUrl":"10.1016/j.inffus.2025.103215","url":null,"abstract":"<div><div>Accurate traffic prediction plays a pivotal role in smart urban systems by enabling effective traffic management and improving the quality of life for residents. While intra-city traffic prediction has been extensively studied, cross-city traffic prediction remains a challenging task due to data scarcity in target cities, domain gaps between cities, and the risk of negative transfer. To address these challenges, we propose a novel Contrastive Region Relevance Learning (CRRL) framework. CRRL leverages contrastive learning to align region-level spatiotemporal patterns and transfer high-quality knowledge between cities. Specifically, CRRL integrates three key modules: (1) a Dual-branch Spatiotemporal Encoder (DSE) to capture region-pair and high-order region group relationships, (2) a Pseudo-Label Generation (PLG) module for aligning cross-city embedding similarities, and (3) a Reliable Region Selection (RRS) module for contrastive learning within consistent regions. 
Extensive experiments on real-world datasets demonstrate that CRRL achieves state-of-the-art performance in cross-city traffic prediction under data-scarce scenarios, showcasing its practicality and effectiveness in addressing urban traffic challenges.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"122 ","pages":"Article 103215"},"PeriodicalIF":14.7,"publicationDate":"2025-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143874113","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fusion of quantum computing and explainable AI: A comprehensive survey on transformative healthcare solutions","authors":"Shashank Sheshar Singh, Sumit Kumar, Rohit Ahuja, Jayendra Barua","doi":"10.1016/j.inffus.2025.103217","DOIUrl":"10.1016/j.inffus.2025.103217","url":null,"abstract":"<div><div>Rapid technological advancements in healthcare have significantly enhanced diagnostics and patient care. The exponential growth of complex medical data requires innovative architectures to ensure efficient processing and real-world deployment. To address this need, quantum computing is revolutionizing healthcare applications by efficiently processing high-dimensional data. At the same time, artificial intelligence (AI) enhances disease detection and treatment planning; however, its black-box nature reduces trust. Explainable artificial intelligence (XAI) bridges this gap by improving interpretability and accountability in healthcare applications. This study explores the fusion of quantum computing with XAI (QXAI) to support improved medical decision-making. It presents a comprehensive survey of state-of-the-art quantum-based, XAI-based, and QXAI-based techniques for healthcare solutions using a survey research questions-based methodology. We critically evaluate the capability of QXAI models to analyze vast and high-dimensional healthcare data using case studies. This work also highlights major challenges in the adoption of QXAI, such as privacy, ethical concerns, and standardization. Through a comprehensive analysis of recent developments, this review outlines key research directions to make QXAI more interpretable, reliable, and ethically compliant in real-world medical applications. 
Thus, this review provides valuable insights for researchers and offers practical guidelines for implementing quantum-enabled XAI models in real-world healthcare environments.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"122 ","pages":"Article 103217"},"PeriodicalIF":14.7,"publicationDate":"2025-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143859998","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Information Fusion, Pub Date: 2025-04-19, DOI: 10.1016/j.inffus.2025.103209
Jialin Zhang , Xianfeng Yuan , Chaoqun Wang , Yong Song , Wenfeng Nie , Fengyu Zhou , Weihua Sheng
{"title":"RF-Fusion: Robust and fine-grained volumetric panoptic mapping system with multi-level data fusion for robots","authors":"Jialin Zhang , Xianfeng Yuan , Chaoqun Wang , Yong Song , Wenfeng Nie , Fengyu Zhou , Weihua Sheng","doi":"10.1016/j.inffus.2025.103209","DOIUrl":"10.1016/j.inffus.2025.103209","url":null,"abstract":"<div><div>For mobile robots to autonomously traverse the scene and perform interactions with the environment, perceiving and understanding the surroundings is essential. Many current incremental panoptic mapping frameworks rely on depth map segmentation algorithms to discover 3D objects. However, depth maps cannot reflect the information of objects with vanishing relative depths, and the depth map acquired by a consumer-grade RGB-D camera degrades with increasing distance, adversely affecting algorithm performance when relying solely on this input. This paper presents a novel incremental panoptic mapping framework that deals with the above problems by fusing multi-source information. We first propose a panoptic mask fusion algorithm to discover 3D objects by fusing information from color images and depth maps, which not only notably reduces the impact of long-distance depth map degradation, but also discovers objects with relatively low depth. To further improve the mapping accuracy, a 3D instance fusion algorithm that incorporates semantic information, geometric information and prior knowledge is designed to achieve map regularization. Extensive experiments demonstrate that compared to the state-of-the-art incremental panoptic mapping framework, our method improves the average panoptic quality of thing classes by 4.1% and 3% on the SceneNN and ScanNet v2 datasets, respectively. Our approach even discovers many objects that are not annotated in the datasets, but are truly present. 
Furthermore, evaluations were conducted on a robot platform in different real-world scenarios characterized by low-depth objects and significant depth map degradation, demonstrating the reliability of our approach for real robot environment perception.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"122 ","pages":"Article 103209"},"PeriodicalIF":14.7,"publicationDate":"2025-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143863449","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Information Fusion, Pub Date: 2025-04-19, DOI: 10.1016/j.inffus.2025.103169
Sai Laung Kham , Kosin Chamnongthai , Nattapol Aunsri
{"title":"A review of efficient techniques and applications of kernel density in particle filtering framework","authors":"Sai Laung Kham , Kosin Chamnongthai , Nattapol Aunsri","doi":"10.1016/j.inffus.2025.103169","DOIUrl":"10.1016/j.inffus.2025.103169","url":null,"abstract":"<div><div>Particle filter (PF), a sequential Monte Carlo (SMC) paradigm, is a powerful technique for managing non-linear dynamics and non-Gaussian noise (complex or irregular noise in measurements), as well as for predicting the unobserved real state of systems. The standard PF algorithm comprises three primary stages: particle prediction, probabilities or weights adjustment, and resampling to regenerate particles. Particle degeneracy, in which most particles gain insignificant weights over multiple iterations, and particle impoverishment, in which the algorithm’s efficacy is reduced due to a lack of particle diversity, are two major issues in PF. To enhance the performance of posterior probability density function (PDF) estimation in dynamic systems, this work explores the integration of kernel density estimation (KDE) into PF. KDE, a non-parametric statistical smoothing method, is incorporated into PF to smooth the resampled particles following each time step to overcome these difficulties. Thus, this paper investigates KDE-integration methods to improve the estimation performance of the PF. This results in a more resilient PF algorithm by maintaining the diversity of particles and avoiding the loss of important information. In this process, choosing the right kernel bandwidth is essential because it controls the smoothness of the distribution. The KDE-PF approach demonstrates higher accuracy and dependability in challenging real-world tracking problems. 
This method offers useful insights for both research and industrial innovation, with potential applications across various sectors.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"122 ","pages":"Article 103169"},"PeriodicalIF":14.7,"publicationDate":"2025-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143869366","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Information Fusion, Pub Date: 2025-04-19, DOI: 10.1016/j.inffus.2025.103208
Yanran Fu , Junshan Han , Yong Xu , Kunhong Liu , Jiang Lu , Song He , Xiaochen Bo
{"title":"RL-GCL: Reinforcement Learning-Guided Contrastive Learning for molecular property prediction","authors":"Yanran Fu , Junshan Han , Yong Xu , Kunhong Liu , Jiang Lu , Song He , Xiaochen Bo","doi":"10.1016/j.inffus.2025.103208","DOIUrl":"10.1016/j.inffus.2025.103208","url":null,"abstract":"<div><div>High-quality molecular characterizations for downstream tasks, such as molecular property prediction and drug design, have been effectively achieved through graph contrastive learning in biomolecules. However, many previous studies have treated molecules as generic graphs while generating augmentations, often ignoring their intrinsic molecular significance. This simplification can alter molecular properties by disrupting important functional groups. To address these challenges, we propose a novel approach called <strong>R</strong>einforcement <strong>L</strong>earning-<strong>G</strong>uided <strong>C</strong>ontrastive <strong>L</strong>earning (RL-GCL) by designing a specialized reward function. This reward function considers molecular similarity and label dissimilarity for positive sample pairs, ensuring that the generated molecular graph augmentations remain consistent with anchor labels while differing from molecular similarity. By doing so, RL-GCL generates valid, label-invariant, and hard molecules as augmentations. Furthermore, we incorporate label information into the contrastive loss function, enabling the model to more accurately distinguish between positive and negative samples. Extensive experiments demonstrate that RL-GCL significantly outperforms state-of-the-art baselines in graph classification across eight diverse datasets. 
Additionally, our method attempts to identify functional groups and substructures that are critical for specific task goals, and is validated on the HIV dataset.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"122 ","pages":"Article 103208"},"PeriodicalIF":14.7,"publicationDate":"2025-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143876964","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fusion of KANO theory and Attention-BiLSTM models for user demand analysis and trend prediction","authors":"Jinghua Zhao , Yajie Huang , Juan Feng , Wanyu Xie , Khushbu Jain","doi":"10.1016/j.inffus.2025.103210","DOIUrl":"10.1016/j.inffus.2025.103210","url":null,"abstract":"<div><div>With the rapid advancement of the internet, predicting user demand trends in user-generated content (UGC) on social media platforms can help businesses better understand user preferences, guiding decision-making and reform efforts. This paper explores product innovation by introducing a UGC-based user demand prediction technique. Initially, the BERTopic model is used to extract product attributes from UGC. The KANO model is then applied to categorize various user demands. An Attention-BiLSTM model, which incorporates empirical mode decomposition (EMD) and other features, is employed to forecast fluctuations and development trends in user demand preferences. The model’s performance is validated using different prediction datasets. To assess its effectiveness, the proposed hybrid model is compared to several leading deep learning algorithms. The combination of the KANO model and Attention-BiLSTM facilitates a more comprehensive analysis of sentiment and demand changes. Additionally, the limitation of existing sentiment-based trend prediction methods – unable to address long-range dependency problems – is overcome. The paper demonstrates the effectiveness of the proposed model framework using UGC data from “Auto-home”, highlighting the model’s superiority in prediction. Compared to state-of-the-art methods, this research improves online review analysis from a temporal perspective. 
This approach offers valuable insights for analyzing users’ product demand and predicting emotional trends related to products.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"122 ","pages":"Article 103210"},"PeriodicalIF":14.7,"publicationDate":"2025-04-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143869361","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Information Fusion, Pub Date: 2025-04-17, DOI: 10.1016/j.inffus.2025.103174
Zheqiu Hetu , Zhenyong Zhang , Mufeng Wang , Zidong Xv , Jie Meng
{"title":"CASI: Context-aware Automatic Semantic Inference by fusing video and network traffic information in industrial control systems","authors":"Zheqiu Hetu , Zhenyong Zhang , Mufeng Wang , Zidong Xv , Jie Meng","doi":"10.1016/j.inffus.2025.103174","DOIUrl":"10.1016/j.inffus.2025.103174","url":null,"abstract":"<div><div>The increasing adoption of Internet of Things devices in Industrial Control Systems (ICS) has provided both attackers and security professionals with new perspectives on these traditionally safety-critical systems. Our observations reveal that video captured by field-deployed cameras contains process variables (PVs) essential for both ICS attack and defense. Based on this insight, we propose a Context-Aware Semantic Inference (CASI) method. CASI integrates video information from cameras with network traffic to infer the physical semantics of PVs in network packets, serving as a tactical tool for attackers or security professionals. A key advantage of CASI is its non-intrusive nature: it derives the location of PVs in the packets and their semantics solely from video and network traffic without interacting with the runtime ICS, which is both covert and ensures the uninterrupted operation of the system. We constructed an ICS testbed using real devices to replicate a simplified industrial process, collecting traffic from 2 industrial protocols and creating 4 datasets. We evaluated CASI’s precision and recall in detecting PVs within this context, comparing it with similar tools. Experimental results demonstrate that CASI achieves a recall rate of 1.00 in PV semantic extraction, significantly outperforming other tools. 
Furthermore, leveraging CASI-derived semantic information for data tampering attacks successfully disrupted monitoring and data acquisition programs, posing a severe security threat to the ICS.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"122 ","pages":"Article 103174"},"PeriodicalIF":14.7,"publicationDate":"2025-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143869364","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}