Analysis of Swin-UNet vision transformer for Inferior Vena Cava filter segmentation from CT scans

Rahul Gomes, Tyler Pham, Nichol He, Connor Kamrowski, Joseph Wildenberg
Journal: Artificial intelligence in the life sciences
Publication type: Journal Article
Published: 2023-08-18
DOI: 10.1016/j.ailsci.2023.100084
URL: https://www.sciencedirect.com/science/article/pii/S2667318523000284

Abstract

Purpose

The purpose of this study is to develop an accurate deep learning model capable of segmenting Inferior Vena Cava (IVC) filters from CT scans. The study compares the impact of Residual Networks (ResNets) combined with reduced convolutional layer depth, and also analyzes the impact of using vision transformer architectures without degrading performance.

Materials and Methods

This experimental retrospective study of 84 CT scans, comprising 54,618 slices, involves the design, implementation, and evaluation of a segmentation algorithm that can be used to generate a clinical report on the presence of IVC filters in abdominal CT scans performed for any reason. Several variants of a patch-based 3D Convolutional Neural Network (CNN) and the Swin UNet Transformer (Swin-UNETR) are used to retrieve the signature of IVC filters. The Dice score is used as the metric for comparing the performance of the segmentation models.
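The Dice score mentioned above measures the overlap between a predicted mask P and a reference mask T as 2|P ∩ T| / (|P| + |T|). The following is a minimal sketch in plain Python, assuming binary masks flattened to 0/1 sequences. The function name `dice_score`, the toy masks, and the convention of returning 1.0 when both masks are empty are illustrative assumptions, not details taken from the study.

```python
from typing import Sequence


def dice_score(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice overlap 2|P ∩ T| / (|P| + |T|) for two flat binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must contain the same number of voxels")
    p = [bool(v) for v in pred]
    t = [bool(v) for v in truth]
    # Voxels labelled as filter in both the prediction and the reference.
    intersection = sum(1 for a, b in zip(p, t) if a and b)
    total = sum(p) + sum(t)
    if total == 0:
        # Both masks empty: treated as perfect agreement (assumed convention).
        return 1.0
    return 2.0 * intersection / total


# Hypothetical 8-voxel masks, for illustration only.
pred = [0, 1, 1, 1, 0, 0, 1, 0]
truth = [0, 1, 1, 0, 0, 0, 1, 1]
print(f"Dice = {dice_score(pred, truth):.2f}")  # prints "Dice = 0.75"
```

A 3D segmentation volume can be scored the same way after flattening it to a single sequence of voxel labels.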

Results

The UNet variant with four ResNet layers showed higher segmentation performance, achieving a median Dice = 0.92 [interquartile range (IQR): 0.85, 0.93], compared to the plain four-layer UNet model with a median Dice = 0.89 [IQR: 0.83, 0.92]. The ResNet variant with two layers achieved a median Dice = 0.93 [IQR: 0.87, 0.94], which was higher than the plain two-layer UNet model at median Dice = 0.87 [IQR: 0.77, 0.90]. Models trained using Swin-based transformers performed significantly better on both training and validation datasets than the four CNN variants. The validation median Dice was highest for the 4-layer Swin UNETR at 0.88, followed by the 2-layer Swin UNETR at 0.85.
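Results of the form "median Dice [IQR: Q1, Q3]" summarize a set of per-scan Dice scores. The sketch below shows one way such a summary could be computed with Python's standard library. The scores are hypothetical placeholders rather than data from the study, and the "inclusive" quartile method is an assumption, since the abstract does not state which quartile convention was used.

```python
import statistics
from typing import Sequence, Tuple


def summarize_dice(scores: Sequence[float]) -> Tuple[float, float, float]:
    """Return (median, Q1, Q3) for a collection of per-scan Dice scores."""
    # quantiles(n=4) returns the three cut points Q1, median, Q3.
    q1, median, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    return median, q1, q3


# Hypothetical per-scan Dice scores, for illustration only.
scores = [0.70, 0.80, 0.85, 0.90, 0.95]
median, q1, q3 = summarize_dice(scores)
print(f"median Dice = {median:.2f} [IQR: {q1:.2f}, {q3:.2f}]")
# prints "median Dice = 0.85 [IQR: 0.80, 0.90]"
```

Different quartile conventions can give slightly different Q1 and Q3 values on small samples, so reported IQRs are only comparable when the same method is used.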

Conclusion

Use of the vision transformer Swin-UNETR yields segmentation output with both low bias and low variance, thereby solving a real-world healthcare problem through advanced Artificial Intelligence (AI) image processing and recognition. The Swin UNETR will reduce the time spent manually tracking IVC filters by centralizing that tracking within the electronic health record. Link to GitHub repository.
