Automatic computed tomography image segmentation method for liver tumor based on a modified tokenized multilayer perceptron and attention mechanism.

IF 2.9 · Zone 2 (Medicine) · JCR Q2 · Radiology, Nuclear Medicine & Medical Imaging

Quantitative Imaging in Medicine and Surgery · Pub Date: 2025-03-03 · Epub Date: 2025-02-26 · DOI: 10.21037/qims-24-2132

Bo Yang, Jie Zhang, Youlong Lyu, Jun Zhang

Quantitative Imaging in Medicine and Surgery, volume 15, issue 3, pages 2385-2404. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11948385/pdf/


Abstract

Background: Automatic segmentation of the liver and liver tumors in medical images plays a pivotal role in the clinical diagnosis of liver diseases. A number of effective methods based on deep neural networks, including convolutional neural networks (CNNs) and vision transformers (ViTs), have been developed. However, these networks primarily focus on improving segmentation accuracy and often overlook segmentation speed, which is vital for rapid diagnosis in clinical settings. We therefore aimed to develop an automatic computed tomography (CT) image segmentation method for liver tumors that reduces inference time while maintaining accuracy, and to validate it experimentally.

Methods: We developed a U-shaped network, enhanced by a multiscale attention module and attention gates, for efficient CT image segmentation of liver tumors. First, a modified tokenized multilayer perceptron (MLP) block reduces the feature dimensions and facilitates information exchange between adjacent patches, so that the network can learn the key features of tumors at lower computational cost. Second, attention gates are added to the skip connections between the encoder and decoder; they emphasize features in relevant regions and let the network focus more on liver tumor features. Finally, a multiscale attention mechanism automatically adjusts the weight of each scale, allowing the network to adapt to liver tumors of varying sizes. The method was validated on the public Liver Tumor Segmentation 2017 (LiTS17) dataset, whose data come from seven clinical sites worldwide. All data are anonymized, and the images have been prescreened to ensure the absence of personal identifiers. Standard metrics were used to evaluate the performance of the model.
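The abstract does not give the exact formulation of the attention gates, so the following is only a minimal sketch of the additive attention gate commonly placed on U-Net skip connections, not the authors' implementation. The function name `attention_gate`, the weight names `w_x`, `w_g`, and `psi`, the tensor shapes, and the assumption that the gating features have already been resampled to the spatial size of the skip features are all illustrative assumptions.

```python
import numpy as np


def sigmoid(z):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def attention_gate(x, g, w_x, w_g, psi):
    """Additive attention gate on a skip connection (illustrative sketch).

    x   : (C_x, H, W) encoder (skip) features
    g   : (C_g, H, W) decoder (gating) features, assumed already resampled
          to the same spatial size as x
    w_x : (C_int, C_x) weights of a 1x1 convolution applied to x
    w_g : (C_int, C_g) weights of a 1x1 convolution applied to g
    psi : (C_int,)     weights that collapse the joint features to one map
    """
    # A 1x1 convolution is a per-pixel linear map over the channel axis.
    theta = np.einsum("ic,chw->ihw", w_x, x)
    phi = np.einsum("ic,chw->ihw", w_g, g)
    joint = np.maximum(theta + phi, 0.0)                 # ReLU
    alpha = sigmoid(np.einsum("i,ihw->hw", psi, joint))  # coefficients in (0, 1)
    # Scale every channel of the skip features by the same spatial map.
    return x * alpha[None, :, :], alpha


# Hypothetical shapes and random weights, for demonstration only.
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 16, 16))
g = rng.normal(size=(4, 16, 16))
w_x = rng.normal(size=(6, 8)) * 0.1
w_g = rng.normal(size=(6, 4)) * 0.1
psi = rng.normal(size=6)

gated, alpha = attention_gate(x, g, w_x, w_g, psi)
print(gated.shape, alpha.shape)
```

In a gate of this form, locations where `alpha` is close to 0 contribute little to the decoder and locations where it is close to 1 pass through almost unchanged, which is one way a network can be steered toward tumor regions.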

Results: A total of 21 cases were included for testing. For liver tumor segmentation, the proposed network attained a Dice score of 0.713 [95% confidence interval (CI): 0.592-0.834], a volumetric overlap error of 0.39 (95% CI: 0.17-0.61), a relative volume difference of 0.19 (95% CI: -0.37 to 0.31), an average symmetric surface distance of 2.04 mm (95% CI: 0.89-4.19 mm), a maximum surface distance of 9.42 mm (95% CI: 6.97-19.87 mm), and an average inference time of 26 ms.
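For readers unfamiliar with the overlap metrics reported above, the sketch below computes the Dice score, volumetric overlap error (VOE), and relative volume difference (RVD) for a pair of binary masks. It uses the common definitions (Dice = 2|P∩T| / (|P|+|T|), VOE = 1 - |P∩T| / |P∪T|, signed RVD = (|P|-|T|) / |T|), which are assumed here and may differ in detail from the authors' evaluation code. The function name `overlap_metrics` and the toy masks are hypothetical.

```python
import numpy as np


def overlap_metrics(pred, truth):
    """Dice, VOE and signed RVD for two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    if pred.shape != truth.shape:
        raise ValueError("masks must have the same shape")

    p = int(pred.sum())    # predicted volume in voxels
    t = int(truth.sum())   # reference volume in voxels
    if t == 0:
        raise ValueError("reference mask is empty; metrics are undefined")

    inter = int(np.logical_and(pred, truth).sum())
    union = int(np.logical_or(pred, truth).sum())

    return {
        "dice": 2.0 * inter / (p + t),
        "voe": 1.0 - inter / union,
        "rvd": (p - t) / t,
    }


# Toy 2D example: a 2x2 reference region and a 2x3 prediction covering it.
truth = np.zeros((4, 4), dtype=bool)
truth[1:3, 1:3] = True   # 4 voxels
pred = np.zeros((4, 4), dtype=bool)
pred[1:3, 1:4] = True    # 6 voxels, 4 of them overlapping

metrics = overlap_metrics(pred, truth)
print(metrics)
```

In this toy case the intersection is 4 voxels and the union is 6, giving Dice = 0.8, VOE = 1/3, and RVD = 0.5 (the prediction over-segments by 50%). Surface-distance metrics are not covered by this sketch.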

Conclusions: The proposed network segmented liver tumors efficiently, with reduced inference time. Our findings contribute to the application of neural networks in rapid clinical diagnosis and treatment.
