PASS: Test-Time Prompting to Adapt Styles and Semantic Shapes in Medical Image Segmentation

Chuyan Zhang, Hao Zheng, Xin You, Yefeng Zheng, Yun Gu
{"title":"PASS: Test-Time Prompting to Adapt Styles and Semantic Shapes in Medical Image Segmentation","authors":"Chuyan Zhang;Hao Zheng;Xin You;Yefeng Zheng;Yun Gu","doi":"10.1109/TMI.2024.3521463","DOIUrl":null,"url":null,"abstract":"Test-time adaptation (TTA) has emerged as a promising paradigm to handle the domain shifts at test time for medical images from different institutions without using extra training data. However, existing TTA solutions for segmentation tasks suffer from 1) dependency on modifying the source training stage and access to source priors or 2) lack of emphasis on shape-related semantic knowledge that is crucial for segmentation tasks. Recent research on visual prompt learning achieves source-relaxed adaptation by extended parameter space but still neglects the full utilization of semantic features, thus motivating our work on knowledge-enriched deep prompt learning. Beyond the general concern of image style shifts, we reveal that shape variability is another crucial factor causing the performance drop. To address this issue, we propose a TTA framework called PASS (Prompting to Adapt Styles and Semantic shapes), which jointly learns two types of prompts: the input-space prompt to reformulate the style of the test image to fit into the pretrained model and the semantic-aware prompts to bridge high-level shape discrepancy across domains. Instead of naively imposing a fixed prompt, we introduce an input decorator to generate the self-regulating visual prompt conditioned on the input data. To retrieve the knowledge representations and customize target-specific shape prompts for each test sample, we propose a cross-attention prompt modulator, which performs interaction between target representations and an enriched shape prompt bank. Extensive experiments demonstrate the superior performance of PASS over state-of-the-art methods on multiple medical image segmentation datasets. The code is available at <uri>https://github.com/EndoluminalSurgicalVision-IMR/PASS</uri>.","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"44 4","pages":"1853-1865"},"PeriodicalIF":0.0000,"publicationDate":"2024-12-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on medical imaging","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10812757/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Test-time adaptation (TTA) has emerged as a promising paradigm for handling domain shifts at test time for medical images from different institutions without using extra training data. However, existing TTA solutions for segmentation tasks suffer from 1) dependency on modifying the source training stage and access to source priors, or 2) lack of emphasis on shape-related semantic knowledge that is crucial for segmentation tasks. Recent research on visual prompt learning achieves source-relaxed adaptation through an extended parameter space but still neglects the full utilization of semantic features, motivating our work on knowledge-enriched deep prompt learning. Beyond the general concern of image style shifts, we reveal that shape variability is another crucial factor causing the performance drop. To address this issue, we propose a TTA framework called PASS (Prompting to Adapt Styles and Semantic shapes), which jointly learns two types of prompts: an input-space prompt that reformulates the style of the test image to fit the pretrained model, and semantic-aware prompts that bridge high-level shape discrepancies across domains. Instead of naively imposing a fixed prompt, we introduce an input decorator to generate a self-regulating visual prompt conditioned on the input data. To retrieve knowledge representations and customize target-specific shape prompts for each test sample, we propose a cross-attention prompt modulator, which performs interaction between target representations and an enriched shape prompt bank. Extensive experiments demonstrate the superior performance of PASS over state-of-the-art methods on multiple medical image segmentation datasets. The code is available at https://github.com/EndoluminalSurgicalVision-IMR/PASS.
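
To make the two prompting mechanisms described in the abstract more concrete, below is a minimal PyTorch sketch of an input-space prompt generator and a cross-attention shape-prompt modulator. All module names, layer choices, and dimensions are illustrative assumptions for exposition and are not the authors' released implementation; see the linked GitHub repository for the actual code.

```python
# Minimal sketch of the two prompt types described in the abstract.
# All module names, layer choices, and dimensions here are assumptions,
# not the authors' released implementation.
import torch
import torch.nn as nn


class InputDecorator(nn.Module):
    """Hypothetical input-space prompt: predicts an additive prompt
    conditioned on the test image to re-style it toward the source domain."""

    def __init__(self, in_channels: int = 1, hidden: int = 16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, hidden, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(hidden, in_channels, kernel_size=3, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The prompt is generated from the input rather than being a fixed tensor.
        return x + self.net(x)


class CrossAttentionPromptModulator(nn.Module):
    """Hypothetical semantic-aware prompt: test-sample features attend over a
    learnable bank of shape prompts to retrieve a target-specific prompt,
    which is added back to the high-level features."""

    def __init__(self, dim: int = 256, bank_size: int = 8, heads: int = 4):
        super().__init__()
        self.prompt_bank = nn.Parameter(torch.randn(bank_size, dim) * 0.02)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (B, C, H, W) high-level features of the test sample.
        b, c, h, w = feats.shape
        q = feats.flatten(2).transpose(1, 2)                   # (B, H*W, C) queries
        kv = self.prompt_bank.unsqueeze(0).expand(b, -1, -1)   # (B, K, C) keys/values
        shape_prompt, _ = self.attn(q, kv, kv)                 # retrieve shape knowledge
        return feats + shape_prompt.transpose(1, 2).reshape(b, c, h, w)


if __name__ == "__main__":
    x = torch.randn(2, 1, 128, 128)                    # e.g. a grayscale image batch
    styled = InputDecorator()(x)                       # input-space (style) prompting
    feats = torch.randn(2, 256, 16, 16)                # stand-in for encoder features
    adapted = CrossAttentionPromptModulator()(feats)   # semantic shape prompting
    print(styled.shape, adapted.shape)
```

The point this sketch captures is that the input-space prompt is conditioned on each test image rather than fixed, while the shape prompt is retrieved per sample by letting target features attend over a learnable prompt bank.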