Embedded prompt tuning: Towards enhanced calibration of pretrained models for medical images

IF 10.7 | Zone 1 (Medicine) | JCR Q1, Computer Science, Artificial Intelligence

Medical Image Analysis, Volume 97, Article 103258. Pub Date: 2024-07-04. DOI: 10.1016/j.media.2024.103258

Wenqiang Zu, Shenghao Xie, Qing Zhao, Guoqi Li, Lei Ma


Citations: 0

Abstract

Foundation models pre-trained on large-scale data have achieved widely recognized success in various downstream tasks on natural images. Parameter-efficient fine-tuning (PEFT) methods aim to adapt foundation models to new domains by updating only a small portion of the parameters, which reduces computational overhead. However, the effectiveness of these PEFT methods, especially in cross-domain few-shot scenarios such as medical image analysis, has not been fully explored. In this work, we facilitate the study of PEFT performance when adapting foundation models to medical image classification tasks. Furthermore, to alleviate two limitations of mainstream prompt tuning methods on Transformer architectures, namely the way prompts are introduced and their approximation capabilities, we propose the Embedded Prompt Tuning (EPT) method, which embeds prompt tokens into the expanded channels. We also find anomalies in the feature space distribution of foundation models that arise during the pre-training process, and we find that prompt tuning can help mitigate this negative impact. To explain this phenomenon, we introduce a novel perspective on prompt tuning: prompt tuning is a distribution calibrator. We support this view by analysing the patch-wise scaling and feature separation operations contained in EPT. Our experiments show that EPT outperforms several state-of-the-art fine-tuning methods by a significant margin on few-shot medical image classification tasks and completes fine-tuning within highly competitive time, indicating that EPT is an effective PEFT method. The source code is available at github.com/zuwenqiang/EPT.
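The abstract contrasts how prompts are introduced in mainstream prompt tuning with embedding them "into the expanded channels". The shape-level difference between those two ideas can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation (see the linked repository for that): the function names, the token and channel sizes, and the choice of one extra channel vector per token are all hypothetical.

```python
import numpy as np

def prepend_prompts(tokens, prompts):
    """Sequence-style prompting: extra prompt tokens are added along the token axis.

    tokens:  (n_tokens, dim)
    prompts: (n_prompts, dim)
    returns: (n_prompts + n_tokens, dim)
    """
    return np.concatenate([prompts, tokens], axis=0)

def embed_prompts_in_channels(tokens, channel_prompts):
    """Channel-style prompting (a hypothetical reading of "expanded channels"):
    every token is widened with extra prompt channels.

    tokens:          (n_tokens, dim)
    channel_prompts: (n_tokens, extra)
    returns:         (n_tokens, dim + extra)
    """
    return np.concatenate([tokens, channel_prompts], axis=1)

# Assumed sizes: 196 patch tokens of width 768, and 10 prompt units.
rng = np.random.default_rng(0)
tokens = rng.standard_normal((196, 768))
seq_prompts = rng.standard_normal((10, 768))    # would be learnable in practice
chan_prompts = rng.standard_normal((196, 10))   # would be learnable in practice

print(prepend_prompts(tokens, seq_prompts).shape)              # (206, 768)
print(embed_prompts_in_channels(tokens, chan_prompts).shape)   # (196, 778)
```

In the first case the sequence gets longer while each token keeps its width; in the second the number of tokens is unchanged and each token gains extra channels. How the widened tokens are then consumed by the frozen backbone is not described in the abstract and is left out of this sketch.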


Source journal

Medical Image Analysis (Engineering & Technology, Engineering: Biomedical)

CiteScore: 22.10

Self-citation rate: 6.40%

Articles published: 309

Review time: 6.6 months

Journal description: Medical Image Analysis serves as a platform for sharing new research findings in medical and biological image analysis, with a focus on applications of computer vision, virtual reality, and robotics to biomedical imaging challenges. The journal prioritizes high-quality, original papers that contribute to the fundamental science of processing, analyzing, and utilizing medical and biological images. It welcomes approaches that use biomedical image datasets across all spatial scales, from molecular/cellular imaging to tissue/organ imaging.