Information Fusion, Vol. 127, Article 103753. Pub Date: 2025-09-18. DOI: 10.1016/j.inffus.2025.103753
Title: Distributed multi-agent fusion state estimation method based on finite-time average consensus for large-scale power systems
Authors: Tengpeng Chen, Chen Zhang, Weize Jing, Eddy Y.S. Foo, Lu Sun, Nianyin Zeng
Abstract: As power systems grow in scale, the measurement transmission load rises sharply, and the large volume of measurements often contains bad data and outliers. To address these issues, this paper proposes a novel distributed multi-agent fusion state estimation (DMFSE) method for large-scale power systems that leverages the finite-time average consensus algorithm and the influence function. The power system is partitioned into multiple subareas, each of which deploys a local estimator. Measurements from each subarea are sent directly to its local estimator rather than to a central estimator, which reduces the burden of extensive data transmission. The finite-time average consensus algorithm and the influence function are combined so that each local estimator obtains the global state estimation results. The optimization function of the proposed DMFSE method is derived from the generalized correntropy loss function, mitigating the effects of bad data and outliers. Simulation results on the IEEE 30-bus, 118-bus, and 300-bus systems demonstrate the superior performance of the proposed DMFSE method.

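For intuition, the averaging primitive that such schemes build on can be sketched as a standard asymptotic average consensus iteration x ← Wx over a hypothetical four-agent ring with Metropolis-style weights. This is not the authors' algorithm: their finite-time variant reaches the exact network average in a bounded number of steps, whereas this baseline only converges asymptotically.

```python
import numpy as np

# Hypothetical 4-agent ring topology. W is symmetric and doubly stochastic,
# so repeated local averaging drives every agent to the global mean.
W = np.array([
    [0.5,  0.25, 0.0,  0.25],
    [0.25, 0.5,  0.25, 0.0 ],
    [0.0,  0.25, 0.5,  0.25],
    [0.25, 0.0,  0.25, 0.5 ],
])

x = np.array([1.0, 3.0, 5.0, 7.0])  # each agent's local value
target = x.mean()                   # the global average (4.0)

for _ in range(100):                # iterate x <- W x
    x = W @ x

print(np.allclose(x, target))       # True: every agent holds the average
```

The spectral gap of W (second-largest eigenvalue 0.5 here) governs how fast the disagreement decays, which is precisely the slowness that finite-time consensus schemes avoid.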
Information Fusion, Vol. 127, Article 103737. Pub Date: 2025-09-18. DOI: 10.1016/j.inffus.2025.103737
Title: ConVLM: Context-guided vision-language model for fine-grained histopathology image classification
Authors: Anabia Sohail, Iyyakutti Iyappan Ganapathi, Basit Alawode, Sajid Javed, Mohammed Bennamoun, Arif Mahmood
Abstract: Vision-Language Models (VLMs) have recently demonstrated exceptional results across various Computational Pathology (CPath) tasks, such as Whole Slide Image (WSI) classification and survival prediction. These models utilize large-scale datasets to align images and text by incorporating language priors during pre-training. However, the separate training of text and vision encoders in current VLMs leads to only coarse-level alignment, failing to capture the fine-level dependencies between image-text pairs. This limitation restricts their generalization in many downstream CPath tasks. In this paper, we propose a novel approach that captures finer-level context through language priors, which better represent the fine-grained tissue morphological structures in histology images. Our Context-guided Vision-Language Model (ConVLM) generates contextually relevant visual embeddings from histology images by employing context-guided token learning and token enhancement modules to identify and eliminate contextually irrelevant visual tokens, refining the visual representation. These two modules are integrated into various layers of the ConVLM encoders to progressively learn context-guided visual embeddings, enhancing visual-language interactions. The model is trained end-to-end using a context-guided token-learning-based loss function. We conducted extensive experiments on 20 histopathology datasets, evaluating both Region of Interest (ROI)-level and cancer-subtype WSI-level classification tasks. The results indicate that ConVLM significantly outperforms existing state-of-the-art (SOTA) vision-language and foundation models. Our source code and pre-trained model are publicly available at https://github.com/BasitAlawode/ConVLM.

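As a toy illustration of the token-pruning idea described above (the scoring rule, dimensions, and top-k selection here are hypothetical, not ConVLM's actual modules), one can score visual tokens against a context embedding and keep only the most relevant ones:

```python
import numpy as np

rng = np.random.default_rng(1)

# 16 visual tokens (32-dim each) and a context embedding, e.g. derived
# from language priors. Both are random stand-ins for illustration.
tokens = rng.normal(size=(16, 32))
context = rng.normal(size=(32,))

scores = tokens @ context          # relevance of each token to the context
keep = np.argsort(scores)[-8:]     # indices of the 8 most relevant tokens
pruned = tokens[np.sort(keep)]     # drop irrelevant tokens, preserve order

print(pruned.shape)                # (8, 32)
```

The pruned token set would then feed the subsequent encoder layers, so later layers spend capacity only on contextually relevant regions.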
Information Fusion, Vol. 127, Article 103729. Pub Date: 2025-09-18. DOI: 10.1016/j.inffus.2025.103729
Title: Comprehensively evaluating the perception systems of autonomous vehicles against hazards
Authors: Xiaodong Zhang, Jie Bao, Jianlei Chi, Jun Sun, Zijiang Yang
Abstract: Perception systems are vital for the safety of autonomous driving. In complex driving scenarios, autonomous vehicles must overcome various natural hazards, such as heavy rain or raindrops on the camera lens. It is therefore essential to test the perception systems of autonomous vehicles comprehensively against these hazards, just as the regulatory agencies of many countries demand of human drivers. Since there are many hazard scenarios, each with multiple configurable parameters, the challenges are (1) how to systematically and adequately test an autonomous vehicle against these hazard scenarios, with measurable outcomes; and (2) how to efficiently explore the huge search space to identify scenarios that would induce failure.

In this work, we propose a Hazards Generation and Testing framework (HazGT) that generates a customizable and comprehensive repository of hazard scenarios for evaluating the perception system of autonomous vehicles. HazGT not only allows us to measure how comprehensively an autonomous vehicle (AV) has been tested against different hazards but also supports the identification of important hazards through optimization. HazGT supports a total of 70 kinds of hazards relevant to the visual perception of AVs, based on industrial regulations, and automatically optimizes parameter values to efficiently achieve different testing objectives. We have implemented HazGT on two popular 3D engines, Unity and Unreal Engine. For two mainstream perception models (YOLO and Faster R-CNN), we evaluated performance against each hazard through extensive experiments; the results show that both models leave considerable room for improvement. Our experiments also found that ChatGPT4 performs slightly worse than YOLO. The optimization-based testing system is effective in finding perceptual errors in the perception models, and the hazard images generated by HazGT are instrumental for improving them.

Information Fusion, Vol. 127, Article 103754. Pub Date: 2025-09-18. DOI: 10.1016/j.inffus.2025.103754
Title: RQCMFuse: a reduced biquaternion-driven collaborative modeling network of infrared saliency and visible color-detail for infrared and visible image fusion
Authors: Shan Gai, Qiyao Liang, Yihao Ni
Abstract: Existing infrared and visible image fusion methods typically use a single-channel fusion strategy, limiting their ability to capture the interdependencies within multi-channel data; as a result, they cannot preserve infrared saliency and visible color-detail simultaneously. Furthermore, most methods focus on spatial feature analysis, neglecting valuable frequency information. To address these issues, we propose RQCMFuse, a novel fusion framework driven by reduced biquaternions (RQ). The framework not only uses RQ to model infrared and visible information in a unified manner but also exploits frequency characteristics for superior fusion performance. Specifically, the model is built on RQ, maintaining low parameter complexity while improving the coordination between infrared and visible features, thereby naturally preserving infrared saliency and visible color-detail. We also introduce an RQ-frequency collaborative block (RQFCB) to efficiently explore frequency characteristics and fuse RQ and frequency-domain features. Additionally, we design an invertible downsampling block (IDB) and an adaptive integration block (AIB): the IDB enables efficient multi-scale feature extraction without losing high-frequency information, while the AIB adaptively integrates RQ features from different layers, preserving both structural semantics and texture details. Extensive experiments on multiple datasets demonstrate the efficiency and generalization ability of the proposed method. The results show that RQCMFuse significantly enhances infrared saliency and visible color-detail, producing visually superior fusion results that align with human visual perception. Code is available at https://github.com/PPBBJL/RQCMFuse.

Information Fusion, Vol. 127, Article 103718. Pub Date: 2025-09-17. DOI: 10.1016/j.inffus.2025.103718
Title: Towards scalable indoor positioning systems (IPS): User-centric challenges, methods, and recommendations for user-friendly crowd-powered framework
Authors: Ahmed Mansour, Wu Chen, Eslam Ali, Jingxian Wang, Duojie Weng
Abstract: Crowd-powered Indoor Positioning Systems (IPS) offer a cost-efficient and scalable alternative to traditional site-survey-based methods for generating the offline prerequisites of ubiquitous, measurement-driven IPS. However, the widespread adoption of such paradigms depends on resolving critical user-centric challenges that span all layers of the crowd-powered architecture. This survey provides a systematic investigation of these challenges, including user participation schemes, incentive mechanisms, privacy and security threats, and the impact of data collection and localization on user devices. To the best of our knowledge, this is the first in-depth review to examine these issues and their implications for data quality, reliability, and scalability, with a specific emphasis on user-friendliness. It maps these challenges across the architectural layers of crowd-powered IPS and reviews prior studies to analyze the user's role and assigned tasks in active, opportunistic, and passive participation schemes, emphasizing the objectives of these tasks and the trade-offs associated with each scheme. Next, it distinguishes incentive mechanisms in crowd-powered IPS from those in other domains, highlighting how intrinsic and extrinsic motivations can be aligned with IPS-specific objectives, and surveys the mathematical models employed in current incentive mechanisms, along with their goals and limitations. Subsequently, it reviews privacy and security risks, the preservation techniques proposed in the existing literature, and their shortcomings. In addition, the survey discusses the adverse impacts of data collection and localization on user devices, identifying potential user burdens and associated mitigation strategies. Finally, it outlines a roadmap of recommendations for developing user-friendly, sustainable, and scalable IPS.

Information Fusion, Vol. 127, Article 103664. Pub Date: 2025-09-16. DOI: 10.1016/j.inffus.2025.103664
Title: Mixture of experts (MoE): A big data perspective
Authors: Wensheng Gan, Zhenyao Ning, Zhenlian Qi, Philip S. Yu
Abstract: As the era of big data arrives, traditional artificial intelligence algorithms struggle to meet the demands of massive and diverse data, whereas mixture of experts (MoE) has shown excellent performance and broad application prospects. This paper provides an in-depth review and analysis of the latest progress in this field from multiple perspectives, including the basic principles, algorithmic models, key technical challenges, and application practices of MoE. First, we introduce the basic concept of MoE and its core idea, and elaborate on its advantages over traditional single models. Then, we discuss the basic architecture of MoE and its main components: the gating network, the expert networks, and the learning algorithms. Next, we review the applications of MoE to key technical issues in big data, including high-dimensional sparse data modeling, heterogeneous multi-source data fusion, real-time online learning, and model interpretability; for each challenge, we present specific MoE solutions and their innovations. Furthermore, we summarize typical use cases of MoE across application domains, including natural language processing, computer vision, and recommender systems, and analyze their outstanding achievements, demonstrating the powerful capability of MoE in big data processing. We also analyze the advantages of MoE in big data environments, including high scalability, efficient resource utilization, and better generalization, as well as the challenges it faces, such as load imbalance and expert utilization, gating-network stability, and training difficulty. Finally, we explore future development trends of MoE, including improved model generalization, enhanced algorithmic interpretability, and increased system automation. We believe that MoE will become an important paradigm of artificial intelligence in the era of big data. In summary, this paper systematically elaborates the principles, techniques, and applications of MoE in big data processing, providing theoretical and practical references to further promote the application of MoE in real scenarios.

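The gating-plus-experts architecture the survey describes can be sketched as a minimal dense MoE layer, where a softmax gating network weights the outputs of several linear "experts" (dimensions and random initialisation are illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Toy dense MoE layer: n_experts linear experts plus a gating network.
d_in, d_out, n_experts = 8, 4, 3
W_gate = rng.normal(size=(d_in, n_experts))
experts = [rng.normal(size=(d_in, d_out)) for _ in range(n_experts)]

def moe_forward(x):
    gate = softmax(x @ W_gate)                  # (batch, n_experts), rows sum to 1
    outs = np.stack([x @ E for E in experts])   # (n_experts, batch, d_out)
    # Per-example weighted sum of expert outputs
    return np.einsum('be,ebd->bd', gate, outs)

x = rng.normal(size=(5, d_in))
y = moe_forward(x)
print(y.shape)  # (5, 4)
```

Sparse MoE variants route each input to only the top-scoring experts instead of all of them, which is where the load-imbalance and expert-utilization challenges mentioned above arise.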
Information Fusion, Vol. 127, Article 103725. Pub Date: 2025-09-16. DOI: 10.1016/j.inffus.2025.103725
Title: Quantum-inspired recommendation approach based on user holographic perception under deep collaboration of large model
Authors: Shanshan Wan, Shuyue Yang
Abstract: Recommender systems powered by deep collaboration with large models can rapidly align with users' preconceived expectations. However, conventional recommenders fail to fully consider the coupling among user behavior factors and the real motives users hide beneath this heavy dependence on large models. In turn, superficial user portraits cause dimensional collapse of recommendation tasks, giving rise to phenomena such as constricted consumption trajectories and preference rigidity. To address these issues, this paper proposes a Quantum-Inspired recommendation approach based on user Holographic Perception under deep collaboration of a large model (QIHP), enabling recommendation extrapolation driven by users' inherent motivations. First, a quantum spatial representation model in large-model micro-environments is established: through a progressive dissociation strategy for psychology/character capsules, sustainable basic "nucleus" user portraits are constructed. Then, a quantum subnet collaborative method is proposed that emphasizes extracting implicit entanglement patterns in user behavior; users' internal shopping drives are stripped away to facilitate the construction of refined decision-making "sand" portraits. Finally, a quantum-state shunt attention is introduced to model latent trend-burst-mimicry behavior patterns. By harnessing a quantum tunneling mechanism, excessively entangled quantum correlations are disentangled, enabling the reconstruction of emergent hypergraphs to establish self-organized sensory "cloud" portraits. Building upon these "nucleus-sand-cloud" holographic polymorphic user portraits, we develop a multi-task inference extrapolation theory that leverages quantum fuzzy logic interventions to satisfy users' non-preconceived expectations. Experimental results show that QIHP achieves substantial improvements, providing a new solution for recommendation in large-model contexts.

Information Fusion, Vol. 127, Article 103747. Pub Date: 2025-09-16. DOI: 10.1016/j.inffus.2025.103747
Title: Fault-tolerant consensus and fault estimation for uncertain multi-agent systems
Authors: Xufeng Chen, Liang Yan, Xiaoshan Gao
Abstract: This paper investigates fault-tolerant consensus and fault estimation for multi-agent systems (MASs) subject to actuator faults, external disturbances, and model uncertainties. Exploiting information exchange among agents, we propose a novel distributed state observer with adjustable parameters (AP-DSO) that simultaneously estimates system states and actuator faults. The observer structure improves estimation performance in terms of the H∞ performance index and provides additional design flexibility. Based on the fault estimates provided by the AP-DSO, we then develop a novel distributed fault-tolerant cooperative controller (DFTCC) using output feedback, designed to ensure that the system achieves the consensus tracking task even in the presence of actuator faults, external disturbances, and uncertainties. The DFTCC comprises a reference controller, a cooperative controller, and a fault-tolerant controller. Finally, simulation results demonstrate the effectiveness and feasibility of the proposed AP-DSO and DFTCC schemes.

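The core idea of jointly estimating states and actuator faults can be illustrated with a minimal single-agent sketch (not the paper's AP-DSO): augment a scalar plant with a constant-fault state and run a Luenberger observer on the augmented system. All numbers here are illustrative.

```python
import numpy as np

# Plant: x_{k+1} = a*x_k + f, with constant actuator fault f; we measure y = x.
# Augmented state z = [x, f] evolves as z_{k+1} = A_aug z_k.
a = 0.9
A_aug = np.array([[a,   1.0],
                  [0.0, 1.0]])    # fault modelled as constant: f_{k+1} = f_k
C_aug = np.array([[1.0, 0.0]])    # only x is measured
L = np.array([[1.4], [0.56]])     # gain placing observer poles at 0.2 and 0.3

f_true = 0.5
x = 1.0                           # true state
z_hat = np.zeros((2, 1))          # observer state: [x_hat, f_hat]

for _ in range(60):
    y = x                                          # measurement at step k
    x = a * x + f_true                             # plant step with fault
    innovation = y - (C_aug @ z_hat)[0, 0]
    z_hat = A_aug @ z_hat + L * innovation         # observer update

print(z_hat.ravel())  # approaches [x, f_true]: roughly [5.0, 0.5]
```

Since the observer error obeys e_{k+1} = (A_aug - L C_aug) e_k with both eigenvalues inside the unit circle, both the state estimate and the fault estimate converge; the distributed setting in the paper additionally fuses neighbors' information.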
Information Fusion, Vol. 127, Article 103746. Pub Date: 2025-09-16. DOI: 10.1016/j.inffus.2025.103746
Title: Bridging imaging and genomics: Domain knowledge guided spatial transcriptomics analysis
Authors: Wei Zhang, Xinci Liu, Tong Chen, Wenxin Xu, Collin Sakal, Ximing Nie, Long Wang, Xinyue Li
Abstract: Spatial Transcriptomics (ST) provides spatially resolved gene expression distributions mapped onto high-resolution Whole Slide Images (WSIs), revealing the association between cellular morphology and gene expression profiles. However, the high costs and equipment constraints of ST data collection have led to a scarcity of ST datasets. Moreover, existing ST datasets often exhibit sparse gene expression distributions, limiting the accuracy and generalizability of gene expression prediction models derived from WSIs. To address these challenges, we propose DomainST (Domain knowledge-guided Spatial Transcriptomics analysis), a novel framework that leverages domain knowledge through Large Language Models (LLMs) to extract effective gene representations and utilizes foundation models to obtain robust image features for enhanced spatial gene expression prediction. Specifically, we use public gene reference databases to retrieve comprehensive gene summaries and employ LLMs to refine gene descriptions and generate informative gene embeddings. Concurrently, we apply medical vision-language foundation models to distill robust image representations at multiple scales, capturing the spatial context of WSIs. We further design a multimodal mixture-of-experts fusion module to integrate the multimodal data effectively, leveraging complementary information across modalities. Extensive experiments on three public ST datasets show that our method consistently outperforms state-of-the-art (SOTA) methods, with gains ranging from 6.7% to 13.7% in PCC@50 across all datasets, demonstrating the effectiveness of combining foundation models and LLM-derived domain knowledge for gene expression prediction. Our code and gene features are available at https://github.com/coffeeNtv/DomainST.

Information Fusion, Vol. 127, Article 103745. Pub Date: 2025-09-16. DOI: 10.1016/j.inffus.2025.103745
Title: FedVCPL-Diff: A federated convolutional prototype learning framework with a diffusion model for speech emotion recognition
Authors: Ruobing Li, Yifan Feng, Lin Shen, Liuxian Ma, Haojie Zhang, Kun Qian, Bin Hu, Yoshiharu Yamamoto, Björn W. Schuller
Abstract: Speech Emotion Recognition (SER), a key emotion analysis technology, has shown significant value in various research areas. Previous SER models have achieved good emotion recognition accuracy, but typical centralised training requires centralised processing of speech data, which carries a serious risk of privacy leakage. Federated learning (FL) avoids centralised data processing through distributed learning, providing a path to privacy protection in SER. However, FL faces several challenges in practical applications, including imbalanced data distributions and inconsistent labelling. Furthermore, typical FL frameworks focus on client-side enhancement and ignore optimisation of the server-side aggregation strategy, which can increase the computational load on clients. To address these problems, we propose a novel approach, FedVCPL-Diff. First, regarding information fusion, we introduce a diffusion model on the server side to generate Valence-Arousal-Dominance (VAD) emotion-space features, replacing the typical aggregation framework and effectively promoting global information fusion. Second, regarding information exchange, we propose a lightweight, personalised FL transmission framework based on the exchange of VAD features. FedVCPL-Diff optimises the local model by updating data-distribution anchors, which not only avoids privacy risks but also reduces communication costs. Experimental results show that the framework significantly improves emotion recognition performance compared with four commonly used FL frameworks, and its overall performance also shows a significant advantage over locally independent models.

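For contrast with the diffusion-based server-side strategy proposed above, the "typical aggregation framework" it replaces is FedAvg-style weighted parameter averaging, sketched here with toy parameters (not from the paper):

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Classic FedAvg aggregation: average client model parameters,
    weighted by each client's local dataset size."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

# Three clients with two-parameter models; the third holds twice the data.
clients = [np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([5.0, 6.0])]
sizes = [10, 10, 20]

global_w = fedavg(clients, sizes)
print(global_w)  # [3.5 4.5]
```

FedAvg requires clients to ship full parameter vectors every round; exchanging compact VAD-space features instead, as the paper proposes, is what cuts the communication cost.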