Rong Zhou, Zhengqing Yuan, Zhiling Yan, Weixiang Sun, Kai Zhang, Yiwei Li, Yanfang Ye, Xiang Li, Lifang He, Lichao Sun
{"title":"TTT-Unet: Enhancing U-Net with Test-Time Training Layers for biomedical image segmentation","authors":"Rong Zhou, Zhengqing Yuan, Zhiling Yan, Weixiang Sun, Kai Zhang, Yiwei Li, Yanfang Ye, Xiang Li, Lifang He, Lichao Sun","doi":"arxiv-2409.11299","DOIUrl":null,"url":null,"abstract":"Biomedical image segmentation is crucial for accurately diagnosing and\nanalyzing various diseases. However, Convolutional Neural Networks (CNNs) and\nTransformers, the most commonly used architectures for this task, struggle to\neffectively capture long-range dependencies due to the inherent locality of\nCNNs and the computational complexity of Transformers. To address this\nlimitation, we introduce TTT-Unet, a novel framework that integrates Test-Time\nTraining (TTT) layers into the traditional U-Net architecture for biomedical\nimage segmentation. TTT-Unet dynamically adjusts model parameters during the\ntesting time, enhancing the model's ability to capture both local and\nlong-range features. We evaluate TTT-Unet on multiple medical imaging datasets,\nincluding 3D abdominal organ segmentation in CT and MR images, instrument\nsegmentation in endoscopy images, and cell segmentation in microscopy images.\nThe results demonstrate that TTT-Unet consistently outperforms state-of-the-art\nCNN-based and Transformer-based segmentation models across all tasks. The code\nis available at https://github.com/rongzhou7/TTT-Unet.","PeriodicalId":501289,"journal":{"name":"arXiv - EE - Image and Video Processing","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - EE - Image and Video Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.11299","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
Biomedical image segmentation is crucial for accurately diagnosing and
analyzing various diseases. However, Convolutional Neural Networks (CNNs) and
Transformers, the most commonly used architectures for this task, struggle to
effectively capture long-range dependencies due to the inherent locality of
CNNs and the computational complexity of Transformers. To address this
limitation, we introduce TTT-Unet, a novel framework that integrates Test-Time
Training (TTT) layers into the traditional U-Net architecture for biomedical
image segmentation. TTT-Unet dynamically adjusts model parameters during the
testing time, enhancing the model's ability to capture both local and
long-range features. We evaluate TTT-Unet on multiple medical imaging datasets,
including 3D abdominal organ segmentation in CT and MR images, instrument
segmentation in endoscopy images, and cell segmentation in microscopy images.
The results demonstrate that TTT-Unet consistently outperforms state-of-the-art
CNN-based and Transformer-based segmentation models across all tasks. The code
is available at https://github.com/rongzhou7/TTT-Unet.