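The impact of multicentric datasets for the automated tumor delineation in primary prostate cancer using convolutional neural networks on 18F-PSMA-1007 PET.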

IF 3.3 | Zone 2 (Medicine) | JCR Q2 ONCOLOGY
Julius C Holzschuh, Michael Mix, Martin T Freitag, Tobias Hölscher, Anja Braune, Jörg Kotzerke, Alexis Vrachimis, Paul Doolan, Harun Ilhan, Ioana M Marinescu, Simon K B Spohn, Tobias Fechter, Dejan Kuhn, Christian Gratzke, Radu Grosu, Anca-Ligia Grosu, C Zamboglou
Journal: Radiation Oncology
Publication type: Journal Article
Published: 2024-08-07
DOI: 10.1186/s13014-024-02491-w (https://doi.org/10.1186/s13014-024-02491-w)
Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11304577/pdf/
Citations: 0

Abstract

The impact of multicentric datasets for the automated tumor delineation in primary prostate cancer using convolutional neural networks on 18F-PSMA-1007 PET.

Purpose: Convolutional Neural Networks (CNNs) have emerged as transformative tools in radiation oncology, significantly advancing the precision of contouring practices. However, the adaptability of these algorithms across diverse scanners, institutions, and imaging protocols remains a considerable obstacle. This study investigates how incorporating institution-specific datasets into CNN training affects generalization in real-world clinical environments. Taking a data-centric approach, it analyzes the influence of multi-center and single-center training on algorithm performance.

Methods: nnU-Net is trained on a dataset of 161 18F-PSMA-1007 PET images collected from four institutions (Freiburg: n = 96, Munich: n = 19, Cyprus: n = 32, Dresden: n = 14). The dataset is partitioned so that data from each center are systematically excluded from training and used solely for testing, which assesses the model's generalizability and adaptability to data from unfamiliar sources. Performance is compared through five-fold cross-validation, contrasting models trained on single-center datasets with those trained on aggregated multi-center datasets. Dice Similarity Score (DSC), Hausdorff distance, and volumetric analysis are used as primary evaluation metrics.
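
The partitioning and the Dice metric described above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the cohort sizes come from the abstract, while the function names (`leave_one_center_out`, `dice_score`), the flattened-list mask representation, and the empty-mask convention are assumptions made for the example.

```python
from typing import Dict, List, Sequence, Tuple

# Cohort sizes as reported in the abstract.
CENTER_SIZES: Dict[str, int] = {
    "Freiburg": 96,
    "Munich": 19,
    "Cyprus": 32,
    "Dresden": 14,
}


def leave_one_center_out(
    center_sizes: Dict[str, int],
) -> List[Tuple[str, List[str], int, int]]:
    """Hold out each center once for testing; train on all remaining centers.

    Returns tuples of (test_center, training_centers, n_train, n_test).
    """
    splits = []
    for test_center, n_test in center_sizes.items():
        train_centers = [c for c in center_sizes if c != test_center]
        n_train = sum(center_sizes[c] for c in train_centers)
        splits.append((test_center, train_centers, n_train, n_test))
    return splits


def dice_score(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice score 2|A∩B| / (|A| + |B|) for two flattened binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


if __name__ == "__main__":
    for test_center, train_centers, n_train, n_test in leave_one_center_out(CENTER_SIZES):
        print(f"test: {test_center} (n={n_test}) | train: {train_centers} (n={n_train})")
```

With these cohort sizes, holding out Freiburg leaves only 65 training images, whereas holding out Dresden leaves 147, so the held-out experiments differ considerably in training-set size.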

Results: In five-fold cross-validation, the mixed training approach yielded a median DSC of 0.76 (IQR: 0.64-0.84), not significantly different (p = 0.18) from models trained with each center's data excluded, which reached a median DSC of 0.74 (IQR: 0.56-0.86). Multi-center training significantly improved performance for the Dresden cohort (multi-center median DSC 0.71, IQR: 0.58-0.80 vs. single-center 0.68, IQR: 0.50-0.80, p < 0.001) and the Cyprus cohort (multi-center 0.74, IQR: 0.62-0.83 vs. single-center 0.72, IQR: 0.54-0.82, p < 0.01). Munich and Freiburg also improved with multi-center training, but the differences were not statistically significant (Munich: multi-center DSC 0.74, IQR: 0.60-0.80 vs. single-center 0.72, IQR: 0.59-0.82, p > 0.05; Freiburg: multi-center 0.78, IQR: 0.53-0.87 vs. single-center 0.71, IQR: 0.53-0.83, p = 0.23).
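
For readers who want to reproduce this style of reporting on their own per-case scores, the sketch below summarizes a list of DSC values as median and IQR and compares two paired sets of scores. The abstract does not say which statistical test the authors used; the exact sign-flip permutation test here is a stand-in chosen because it needs only the standard library. The function names and any scores passed in are hypothetical.

```python
from itertools import product
from statistics import median, quantiles
from typing import Sequence, Tuple


def median_iqr(scores: Sequence[float]) -> Tuple[float, float, float]:
    """Return (median, Q1, Q3).

    Quartile conventions differ between tools; this uses the
    'inclusive' method of statistics.quantiles.
    """
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    return median(scores), q1, q3


def paired_sign_flip_p(a: Sequence[float], b: Sequence[float]) -> float:
    """Exact two-sided sign-flip permutation test on paired differences.

    Stand-in only: not necessarily the test used in the study.
    Enumerates all 2**n sign patterns, so it is practical for small n only.
    """
    if len(a) != len(b):
        raise ValueError("paired samples must have equal length")
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(sum(diffs))
    hits = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = abs(sum(s * d for s, d in zip(signs, diffs)))
        # Small tolerance guards against floating-point round-off.
        if flipped >= observed - 1e-12:
            hits += 1
    return hits / total


if __name__ == "__main__":
    # Hypothetical per-case DSC values, for illustration only.
    multi = [0.80, 0.70, 0.90, 0.75, 0.60]
    single = [0.70, 0.65, 0.85, 0.75, 0.50]
    print("multi-center  median, Q1, Q3:", median_iqr(multi))
    print("single-center median, Q1, Q3:", median_iqr(single))
    print("paired p-value:", paired_sign_flip_p(multi, single))
```

A paired test fits this comparison because both models are evaluated on the same test cases; for cohorts of realistic size, a sampled permutation test or a library routine would replace the exhaustive enumeration.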

Conclusion: CNNs trained to auto-contour the intraprostatic gross tumor volume (GTV) in 18F-PSMA-1007 PET on a diverse multi-center dataset mostly generalize well to unseen data from other centers. For intraprostatic 18F-PSMA-1007 PET GTV segmentation, training on a multicentric dataset can improve performance compared with training on a single-center dataset alone. The segmentation performance of the same CNN can vary depending on the datasets used for training and testing.

Source journal
Radiation Oncology (ONCOLOGY-RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING)
CiteScore: 6.50
Self-citation rate: 2.80%
Articles published: 181
Review time: 3-6 weeks
About the journal: Radiation Oncology encompasses all aspects of research that impacts on the treatment of cancer using radiation. It publishes findings in molecular and cellular radiation biology, radiation physics, radiation technology, and clinical oncology.