THPFF: A tensor-based high-precision feature fusion model for multi-source data in smart healthcare systems
Songhe Yuan, Laurence T. Yang, Debin Liu, Xiaokang Wang, Jieming Yang
Information Fusion, vol. 124, Article 103324. Published 2025-05-29. DOI: 10.1016/j.inffus.2025.103324
https://www.sciencedirect.com/science/article/pii/S1566253525003975
Journal impact factor: 15.5 (JCR Q1, Computer Science, Artificial Intelligence)
Citations: 0
Abstract
Deep learning has revolutionized the field of medical analysis. However, its progress is often constrained by the heterogeneity of multi-sensor data and the lack of a unified predictive architecture. Visual Prompting (VP) has emerged as a promising approach, enabling the efficient transfer of fusion knowledge from pre-trained models at lower computational cost. Combining VP with the medical field may yield unexpected results. This study explores the potential of VP for medical image recognition and introduces a novel method named Dual Visual Prompt (DVP), which consists of Image-Feature Visual Prompting (IF-VP). The approach fuses input-level and feature-level prompts into a frozen image encoder, thereby boosting its learning efficacy for both CNN-based and CLIP-based VP. For Feature Prompts (FP), we propose the Adaptive Energy-Weighted Tensor Decomposition (FP-AEWTD) technique to optimize feature extraction. Furthermore, we devise a Border Merging (BM) strategy that improves the stability of pre-trained classifiers' label confidence, specifically under CNN-based VP. IF-VP was assessed on 12 distinct medical image recognition tasks, demonstrating its potential to be both precise and resource-efficient. In particular, on the ABIDE dataset, VP-based training outperformed full fine-tuning, with improvements of up to 11.9% on ResNet-18 and 7.7% on ResNeXt-101-32×8d. This research paves the way for further exploration of the scalability and adaptability of VP techniques in medical applications, potentially leading to broader implementations and innovations for smart healthcare.
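To make the idea of an input-level prompt concrete, the following is a minimal sketch of the padding-style prompt that is common in the VP literature: a learnable frame of pixels is placed around the input image before it is passed to a frozen encoder. It is an illustration only, not the paper's IF-VP or Border Merging implementation, whose details the abstract does not give. The function name `apply_border_prompt`, the `pad` parameter, and the toy values are all assumptions made for this example.

```python
from typing import List

Image = List[List[float]]


def apply_border_prompt(image: Image, prompt: Image, pad: int) -> Image:
    """Place `image` in the centre of `prompt`, keeping only the prompt's border.

    Hypothetical helper: `prompt` is a (H + 2*pad) x (W + 2*pad) canvas of
    learnable values. Only its outer frame of width `pad` survives; the
    interior is overwritten by the input image.
    """
    height, width = len(image), len(image[0])
    if len(prompt) != height + 2 * pad or len(prompt[0]) != width + 2 * pad:
        raise ValueError("prompt must be the image size plus 2*pad per axis")

    prompted = [row[:] for row in prompt]  # copy so the prompt is not mutated
    for r in range(height):
        for c in range(width):
            prompted[r + pad][c + pad] = image[r][c]
    return prompted


# Toy example: a 2x2 "image" framed by a 1-pixel prompt of constant value 9.
image = [[1.0, 2.0], [3.0, 4.0]]
prompt = [[9.0] * 4 for _ in range(4)]
prompted = apply_border_prompt(image, prompt, pad=1)
```

In a real training loop only the prompt values would receive gradient updates while the encoder weights stay fixed, which is where the reduced computational cost of VP relative to full fine-tuning comes from.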
About the journal
Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among the diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers dealing with fundamental theoretical analyses, as well as those demonstrating their application to real-world problems, are welcome.