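Trade-off of different deep learning-based auto-segmentation approaches for treatment planning of pediatric craniospinal irradiation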

Medical Physics. Pub Date: 2025-04-01. DOI: 10.1002/mp.17782
Alana Thibodeau-Antonacci, Marija Popovic, Ozgur Ates, Chia-Ho Hua, James Schneider, Sonia Skamene, Carolyn Freeman, Shirin Abbasinejad Enger, James Man Git Tsui

Abstract


Background: As auto-segmentation tools become integral to radiotherapy, more commercial products emerge. However, they may not always suit our needs. One notable example is the use of adult-trained commercial software for the contouring of organs at risk (OARs) of pediatric patients.

Purpose: This study aimed to compare three auto-segmentation approaches in the context of pediatric craniospinal irradiation (CSI): commercial, out-of-the-box, and in-house.

Methods: CT scans from 142 pediatric patients undergoing CSI were obtained from St. Jude Children's Research Hospital (training: 115; validation: 27). A test dataset of 16 CT scans was collected from the McGill University Health Centre. Eighteen OARs were manually delineated on all images. LimbusAI v1.7 served as the commercial product, while nnU-Net was trained for benchmarking. Additionally, a two-step in-house approach was pursued in which smaller 3D CT scans containing the OAR of interest were first recovered and then used as input to train organ-specific models. Three variants of the U-Net architecture were explored: a basic U-Net, an attention U-Net, and a 2.5D U-Net. Segmentation accuracy was assessed with the Dice similarity coefficient (DSC), and the trend of the DSC with age was investigated with the Mann-Kendall test. A radiation oncologist rated the clinical acceptability of all contours on a five-point Likert scale.
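As an illustration of the accuracy metric named in the Methods, the DSC between a predicted and a reference contour can be computed from their binary masks as DSC = 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch, not the authors' code: the function name `dice_similarity`, the toy masks, and the convention that two empty masks score 1.0 are assumptions made for this example.

```python
import numpy as np


def dice_similarity(pred, ref):
    """Return DSC = 2|A ∩ B| / (|A| + |B|) for two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    if pred.shape != ref.shape:
        raise ValueError("masks must have the same shape")
    total = pred.sum() + ref.sum()
    if total == 0:
        # Assumption for this sketch: two empty masks count as perfect agreement.
        return 1.0
    return float(2.0 * np.logical_and(pred, ref).sum() / total)


# Toy 3D masks (hypothetical data): the reference holds 8 voxels, the
# prediction holds 12 voxels, and all 8 reference voxels are covered.
ref_mask = np.zeros((4, 4, 4), dtype=bool)
ref_mask[1:3, 1:3, 1:3] = True
pred_mask = np.zeros((4, 4, 4), dtype=bool)
pred_mask[1:3, 1:3, 1:4] = True

dsc = dice_similarity(pred_mask, ref_mask)  # 2 * 8 / (8 + 12)

# A contour that is missed entirely (a false negative) scores 0.
dsc_missed = dice_similarity(np.zeros_like(ref_mask), ref_mask)
```

Because a missed organ contributes a DSC of 0, even a few false negatives can pull down the mean DSC for that organ noticeably.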

Results: Differences in the contours between the validation and test datasets reflected the distinct institutional standards. With LimbusAI, the DSC values of the lungs and left kidney showed an increasing trend with age on both the validation and test datasets. LimbusAI contours of the esophagus were often truncated distally and mistaken for the trachea in younger patients, resulting in a DSC of less than 0.5 on both datasets. Additionally, the kidneys frequently exhibited false negatives, leading to mean DSC values that were up to 0.11 lower on the validation set and up to 0.07 lower on the test set compared with the other models. Overall, nnU-Net performed well for body organs but had difficulty differentiating the laterality of head structures, resulting in a large variation of DSC values, with the standard deviation reaching 0.35 for the lenses. All in-house models generally had similar DSC values when compared against each other and against nnU-Net. Inference time on the test data was between 47 and 55 min on a central processing unit (CPU) for the in-house models, while it was 1 h 21 min on a V100 graphics processing unit (GPU) for nnU-Net.
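The age-related DSC trend reported above was assessed with the Mann-Kendall test. The following is a minimal sketch of how such a trend test could be run on per-patient DSC values ordered by age. It is not the authors' analysis: the function name `mann_kendall`, the ages, and the DSC values are hypothetical, and the variance formula assumes no tied values (tied data would need the usual tie correction).

```python
import math


def mann_kendall(values):
    """Mann-Kendall trend test on values already ordered by the covariate (here, age).

    Returns (S, z, two-sided p) using the normal approximation.
    Assumption for this sketch: no tied values, so no tie correction is applied.
    """
    n = len(values)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)  # sign of the pairwise difference
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)  # continuity correction
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p-value
    return s, z, p


# Hypothetical ages (years) and DSC values for one organ, not study data.
ages = [12, 3, 9, 5, 15, 7]
dsc_values = [0.85, 0.70, 0.80, 0.74, 0.88, 0.73]

# Order the DSC values by patient age before testing for a monotonic trend.
dsc_by_age = [d for _, d in sorted(zip(ages, dsc_values))]
s_stat, z_stat, p_value = mann_kendall(dsc_by_age)
```

A positive S with a small p-value would indicate that the DSC tends to rise with age, which is the pattern the abstract describes for the lungs and left kidney with LimbusAI.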

Conclusions: LimbusAI could not adapt well to pediatric anatomy for the esophagus and the kidneys. When commercial products do not suit the study population, the nnU-Net is a viable option but requires adjustments. In resource-constrained settings, the in-house model provides an alternative. Implementing an automated segmentation tool requires careful monitoring and quality assurance regardless of the approach.
