Optimizing Performance of Transformer-based Models for Fetal Brain MR Image Segmentation.

IF 13.2 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Radiology-Artificial Intelligence | Pub Date: 2024-11-01 | DOI: 10.1148/ryai.230229

Nicolò Pecco, Pasquale Anthony Della Rosa, Matteo Canini, Gianluca Nocera, Paola Scifo, Paolo Ivo Cavoretto, Massimo Candiani, Andrea Falini, Antonella Castellano, Cristina Baldoli


Abstract

Purpose: To test the performance of a transformer-based model when manipulating pretraining weights, dataset size, and input size, and to compare the best model with reference standard and state-of-the-art models for a resting-state functional MRI (rs-fMRI) fetal brain extraction task.

Materials and Methods: An internal retrospective dataset (172 fetuses, 519 images; collected 2018-2022) was used to investigate the influence of dataset size, pretraining approaches, and image input size on Swin U-Net transformer (Swin-UNETR) and UNETR models. The internal and external (131 fetuses, 561 images) datasets were used to cross-validate and to assess the generalization capability of the best model versus state-of-the-art models across scanner types and gestational weeks (GWs). The Dice similarity coefficient (DSC) and the balanced average Hausdorff distance (BAHD) were used as segmentation performance metrics. Generalized estimating equation (GEE) multifactorial models were used to assess significant model and interaction effects of interest.

Results: The Swin-UNETR model was not affected by the pretraining approach or dataset size and performed best with the mean dataset image size, with a mean DSC of 0.92 and a BAHD of 0.097. Swin-UNETR was not affected by scanner type. In the generalization analysis, Swin-UNETR performed worse than the reference standard models on the internal dataset and comparably on the external dataset. Cross-validation on internal and external test sets showed that Swin-UNETR performed better than or comparably to convolutional neural network architectures during the late-fetal period (GWs > 25) but worse during the midfetal period (GWs ≤ 25).

Conclusion: Swin-UNETR showed flexibility in dealing with smaller datasets, regardless of pretraining approach. For fetal brain extraction from rs-fMR images, Swin-UNETR performed comparably to reference standard models during the late-fetal period and worse at earlier gestational weeks.

Keywords: Transformers, CNN, Medical Imaging Segmentation, MRI, Dataset Size, Input Size, Transfer Learning

Supplemental material is available for this article. © RSNA, 2024.
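The two segmentation metrics named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation, which the abstract does not describe. The function names `dice_coefficient` and `balanced_average_hausdorff` are hypothetical. The BAHD here assumes the common "balanced" formulation, in which both directed distance sums are divided by the number of ground-truth voxels. It is computed over all foreground voxels, in voxel units, by brute force, so it is only practical for small masks.

```python
import numpy as np


def dice_coefficient(pred, truth):
    """DSC = 2 * |P intersect T| / (|P| + |T|) for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    total = pred.sum() + truth.sum()
    if total == 0:
        # Both masks empty: treated as perfect agreement (a convention, assumed here).
        return 1.0
    return 2.0 * np.logical_and(pred, truth).sum() / total


def balanced_average_hausdorff(pred, truth):
    """Balanced average Hausdorff distance in voxel units (assumed formulation).

    Both directed sums of nearest-neighbor distances are divided by the
    number of ground-truth voxels, then averaged.
    """
    p = np.argwhere(np.asarray(pred, dtype=bool)).astype(float)
    t = np.argwhere(np.asarray(truth, dtype=bool)).astype(float)
    if len(p) == 0 or len(t) == 0:
        raise ValueError("both masks must contain at least one foreground voxel")
    # Pairwise Euclidean distances: rows are predicted voxels, columns are truth voxels.
    d = np.linalg.norm(p[:, None, :] - t[None, :, :], axis=-1)
    pred_to_truth = d.min(axis=1).sum()
    truth_to_pred = d.min(axis=0).sum()
    return 0.5 * (pred_to_truth + truth_to_pred) / len(t)


# Toy 2D example: the prediction over-segments the truth by one column.
truth_mask = np.zeros((4, 4), dtype=bool)
truth_mask[1:3, 1:3] = True
pred_mask = np.zeros((4, 4), dtype=bool)
pred_mask[1:3, 1:4] = True

dsc = dice_coefficient(pred_mask, truth_mask)
bahd = balanced_average_hausdorff(pred_mask, truth_mask)
```

A higher DSC and a lower BAHD both indicate closer agreement with the reference mask. For real 3D volumes, a distance-transform or KD-tree approach would replace the pairwise distance matrix, and voxel spacing would need to be taken into account.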




Source journal

Radiology-Artificial Intelligence

CiteScore

16.20

Self-citation rate

1.00%


About the journal: Radiology: Artificial Intelligence is a bimonthly publication that focuses on emerging applications of machine learning and artificial intelligence in imaging across disciplines. The journal is available online and accepts multiple manuscript types, including Original Research, Technical Developments, Data Resources, Review articles, Editorials, Letters to the Editor and Replies, Special Reports, and AI in Brief.