Rong Zhou, Zhengqing Yuan, Zhiling Yan, Weixiang Sun, Kai Zhang, Yiwei Li, Yanfang Ye, Xiang Li, Lifang He, Lichao Sun
arXiv:2409.11299 · arXiv - EE - Image and Video Processing · Published 2024-09-17
TTT-Unet: Enhancing U-Net with Test-Time Training Layers for biomedical image segmentation
Biomedical image segmentation is crucial for accurately diagnosing and
analyzing various diseases. However, Convolutional Neural Networks (CNNs) and
Transformers, the most commonly used architectures for this task, struggle to
effectively capture long-range dependencies due to the inherent locality of
CNNs and the computational complexity of Transformers. To address this
limitation, we introduce TTT-Unet, a novel framework that integrates Test-Time
Training (TTT) layers into the traditional U-Net architecture for biomedical
image segmentation. TTT-Unet dynamically adjusts model parameters at test
time, enhancing the model's ability to capture both local and long-range
features. We evaluate TTT-Unet on multiple medical imaging datasets,
including 3D abdominal organ segmentation in CT and MR images, instrument
segmentation in endoscopy images, and cell segmentation in microscopy images.
The results demonstrate that TTT-Unet consistently outperforms state-of-the-art
CNN-based and Transformer-based segmentation models across all tasks. The code
is available at https://github.com/rongzhou7/TTT-Unet.