THPFF: A tensor-based high-precision feature fusion model for multi-source data in smart healthcare systems

IF 15.5 · Zone 1 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Information Fusion · Pub Date: 2025-05-29 · DOI: 10.1016/j.inffus.2025.103324

Songhe Yuan, Laurence T. Yang, Debin Liu, Xiaokang Wang, Jieming Yang

Information Fusion, Volume 124, Article 103324 (Journal Article). Publisher page: https://www.sciencedirect.com/science/article/pii/S1566253525003975

Citation count: 0

Abstract

Deep learning has revolutionized the field of medical analysis. However, its progress is often constrained by the heterogeneity of multi-sensor data and the lack of a unified predictive architecture. Visual Prompting (VP) has emerged as a promising approach, enabling efficient transfer of fusion knowledge from pre-trained models at lower computational cost. Bringing VP into the medical field may yield unexpected results. This study examines the potential of VP for medical image recognition and introduces a novel method named Dual Visual Prompt (DVP), which consists of Image-Feature Visual Prompting (IF-VP). The approach fuses input-level and feature-level prompts into a frozen image encoder, thereby improving its learning efficacy under both CNN-based and CLIP-based VP. For Feature Prompts (FP), we propose the Adaptive Energy-Weighted Tensor Decomposition (FP-AEWTD) technique to optimize feature extraction. Furthermore, we devise a Border Merging (BM) strategy that stabilizes the label confidence of pre-trained classifiers, specifically under CNN-based VP. IF-VP was assessed on 12 distinct medical image recognition tasks, demonstrating its potential to be both precise and resource-efficient. On the ABIDE dataset in particular, VP-based training outperformed full fine-tuning, with improvements of up to 11.9% on ResNet-18 and 7.7% on ResNeXt-101-32×8d. This research paves the way for further exploration of the scalability and adaptability of VP techniques in medical applications, potentially leading to broader implementations and innovations for smart healthcare.
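The abstract describes fusing an input-level prompt and a feature-level prompt into a frozen encoder. The following is a minimal sketch of that general idea only, not the authors' implementation: it assumes the input-level prompt is a trainable border surrounding the image, the feature-level prompt is a trainable vector added to the encoder output, and the frozen encoder is stood in for by a fixed random linear map with a ReLU. All names (`apply_input_prompt`, `frozen_encoder`, `dual_prompt_forward`) and all sizes are hypothetical, and FP-AEWTD and Border Merging are not modelled.

```python
import numpy as np


def apply_input_prompt(image, border, pad):
    """Place `image` (C, h, w) at the centre of a trainable canvas `border` (C, h+2*pad, w+2*pad)."""
    _, h, w = image.shape
    canvas = border.copy()
    # Interior pixels come from the image; only the surrounding frame is the prompt.
    canvas[:, pad:pad + h, pad:pad + w] = image
    return canvas


def frozen_encoder(x, weights):
    """Stand-in for a frozen pre-trained encoder: fixed linear map followed by ReLU."""
    return np.maximum(weights @ x.reshape(-1), 0.0)


def dual_prompt_forward(image, border, feature_prompt, weights, pad):
    """Apply the input-level prompt, run the frozen encoder, then add the feature-level prompt."""
    prompted = apply_input_prompt(image, border, pad)
    features = frozen_encoder(prompted, weights)
    return features + feature_prompt


rng = np.random.default_rng(0)
pad, c, h, w, dim = 4, 3, 24, 24, 16          # hypothetical sizes

image = rng.random((c, h, w))
border = rng.normal(scale=0.1, size=(c, h + 2 * pad, w + 2 * pad))  # trainable (input prompt)
feature_prompt = np.zeros(dim)                                      # trainable (feature prompt)
weights = rng.normal(size=(dim, c * (h + 2 * pad) * (w + 2 * pad)))  # frozen, never updated

prompted = apply_input_prompt(image, border, pad)
out = dual_prompt_forward(image, border, feature_prompt, weights, pad)
print(prompted.shape, out.shape)
```

In a training loop under these assumptions, only `border` and `feature_prompt` would receive gradient updates while `weights` stays fixed, which is where the lower computational cost of prompting relative to full fine-tuning would come from.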


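Source journal

Information Fusion (Engineering & Technology - Computer Science: Theory & Methods)

CiteScore: 33.20

Self-citation rate: 4.30%

Articles published: 161

Review time: 7.9 months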

About the journal: Information Fusion serves as a central platform for showcasing advances in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among the diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers dealing with fundamental theoretical analyses, as well as those demonstrating their application to real-world problems, are welcome.