Anders Traberg Hansen, Johannes Thestrup Askglæde, Jesper Folsted Kallehauge, Anders Schwartz Vittrup, Kim Hochreuter, Slavka Lukacova
Technical Innovations and Patient Support in Radiation Oncology, vol. 35, article 100330. Published 2025-08-06.
DOI: https://doi.org/10.1016/j.tipsro.2025.100330
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12357038/pdf/
Clinical evaluation of two glioblastoma delineation methods based on neural networks.
Background and purpose: Precise gross tumour volume definition is essential for radiotherapy. Neural networks may improve tumour delineation and reduce manual workload. However, clinical evaluation is crucial for understanding their precision and limitations.
Materials and methods: Two neural network-based models were evaluated for glioblastoma delineation in 70 clinical cases: one developed by Cercare Medical Inc (CMN) and the publicly available Raidionics model. Delineations were compared using the 95th percentile Hausdorff distance (HD95), the Dice similarity coefficient (DSC) and the prevalence of false-positive and false-negative volumes. Additionally, interobserver variability between clinicians and the dosimetric consequences of differences in delineation were assessed.
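The metrics named above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how HD95 or the false-positive and false-negative volumes were computed, so the definitions below are common conventions taken as assumptions. Masks are represented as sets of integer voxel coordinates, HD95 is taken as the larger of the two directed 95th-percentile nearest-neighbour distances, and distances are computed over all mask voxels rather than surface voxels only, which many tools use instead. The function names and the toy cubes `ref` and `pred` are hypothetical.

```python
import math


def dice(ref, pred):
    """Dice similarity coefficient between two sets of voxel coordinates."""
    if not ref and not pred:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2 * len(ref & pred) / (len(ref) + len(pred))


def false_fractions(ref, pred):
    """False-positive and false-negative volumes as fractions of the reference volume."""
    n = len(ref)
    return len(pred - ref) / n, len(ref - pred) / n


def percentile(values, q):
    """q-th percentile with linear interpolation between sorted values."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100
    lo, hi = math.floor(pos), math.ceil(pos)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def hd95(a, b, spacing=(1.0, 1.0, 1.0)):
    """Symmetric 95th-percentile Hausdorff distance in mm (brute force, assumed definition)."""
    def scaled(p):
        # Convert voxel indices to millimetres using the voxel spacing.
        return tuple(c * s for c, s in zip(p, spacing))

    def directed(src, dst):
        # Distance from every point in src to its nearest point in dst.
        dst_mm = [scaled(q) for q in dst]
        return [min(math.dist(scaled(p), q) for q in dst_mm) for p in src]

    return max(percentile(directed(a, b), 95), percentile(directed(b, a), 95))


# Toy data (hypothetical): a 4x4x4 reference cube and the same cube shifted by one voxel in x.
ref = {(x, y, z) for x in range(4) for y in range(4) for z in range(4)}
pred = {(x + 1, y, z) for (x, y, z) in ref}

print("DSC:", dice(ref, pred))
print("FP, FN fractions:", false_fractions(ref, pred))
print("HD95 (mm):", hd95(ref, pred))
```

The brute-force nearest-neighbour search is quadratic in the number of voxels, so it only suits small toy masks; for clinical volumes a distance transform or a spatial index would be the usual choice.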
Results: The Raidionics model achieved a mean HD95 of 5.61 mm [2.13, 14.8] and a mean DSC of 0.80 [0.62, 0.92], where brackets give the 5th and 95th percentiles. The CMN model achieved a mean HD95 of 4.24 mm [2.05, 10.2] and a mean DSC of 0.83 [0.65, 0.93]. For both metrics, the Wilcoxon rank test showed a significant difference between the models (p < 0.002). Both models produced small false-positive volumes, averaging less than 10% of the true tumour volume. The false-negative volumes averaged around 20% of the true tumour volume for both models. Interobserver variability between clinicians corresponded to an HD95 of 2.91 mm and a DSC of 0.89.
Conclusion: The CMN model performed significantly better than the Raidionics model. Both models demonstrated a low occurrence of false-positive delineations and acceptable robustness in preserving dose coverage. However, their performance remained inferior to that of clinical experts. Further model development is recommended before potential clinical implementation.