Anders Traberg Hansen, Johannes Thestrup Askglæde, Jesper Folsted Kallehauge, Anders Schwartz Vittrup, Kim Hochreuter, Slavka Lukacova
Clinical evaluation of two glioblastoma delineation methods based on neural networks.
Technical Innovations and Patient Support in Radiation Oncology, vol. 35, article 100330. Published 2025-08-06.
DOI: https://doi.org/10.1016/j.tipsro.2025.100330
Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12357038/pdf/
Abstract
Background and purpose: Precise gross tumour volume definition is essential for radiotherapy. Neural networks may improve tumour delineation and reduce manual workload. However, clinical evaluation is crucial for understanding their precision and limitations.
Materials and methods: Two neural network-based models were evaluated for glioblastoma delineation in 70 clinical cases: one developed by Cercare Medical Inc (CMN) and the publicly available Raidionics model. Delineations were compared using the 95th percentile Hausdorff distance (HD95), the Dice similarity coefficient (DSC), and the prevalence of false-positive and false-negative volumes. Additionally, interobserver variability between clinicians and the dosimetric consequences of differences in delineation were assessed.
Results: The Raidionics model achieved a mean HD95 of 5.61 mm, with a 5th and 95th percentile range of 2.13-14.8 mm, and a mean DSC of 0.80 [0.62, 0.92]. The CMN model achieved a mean HD95 of 4.24 mm [2.05, 10.2] and mean DSC of 0.83 [0.65, 0.93]. For both metrics the Wilcoxon rank test showed a significant difference (p < 0.002). Both models produced small false-positive volumes, averaging less than 10 % of the true volume. The false-negative volumes averaged around 20 % of the true tumour volume for both models. The HD95 and DSC of interobserver variability were found to be 2.91 mm and 0.89 respectively.
Conclusion: The CMN model performed significantly better than the Raidionics model. Both models demonstrated a low occurrence of false-positive delineations and acceptable robustness in preserving dose coverage. However, their performance remained inferior to that of clinical experts. Further model development is recommended before potential clinical implementation.
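The comparison metrics named in the abstract (DSC, HD95, and false-positive and false-negative volumes relative to the true volume) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes each delineation is a set of integer voxel indices, that HD95 is the larger of the two directed 95th percentile surface distances, that the percentile uses the nearest-rank rule, and that the surface consists of voxels with a 6-connected neighbour outside the mask. All function names (`dice`, `fp_fn_fractions`, `surface`, `hd95`, `cube`) are hypothetical, and published tools may define these quantities slightly differently.

```python
import math

# 6-connected neighbour offsets used to find surface voxels (an assumption).
NEIGHBOURS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def dice(pred, ref):
    """Dice similarity coefficient between two sets of voxel indices."""
    if not pred and not ref:
        return 1.0
    return 2 * len(pred & ref) / (len(pred) + len(ref))


def fp_fn_fractions(pred, ref):
    """False-positive and false-negative volumes as fractions of the reference volume."""
    if not ref:
        raise ValueError("reference mask is empty")
    return len(pred - ref) / len(ref), len(ref - pred) / len(ref)


def surface(mask):
    """Voxels that have at least one 6-connected neighbour outside the mask."""
    return {
        v for v in mask
        if any((v[0] + dx, v[1] + dy, v[2] + dz) not in mask for dx, dy, dz in NEIGHBOURS)
    }


def directed_p95(src, dst, spacing):
    """Nearest-rank 95th percentile of the distance from each src voxel to its nearest dst voxel."""
    def to_mm(v):
        return [i * s for i, s in zip(v, spacing)]

    dists = sorted(min(math.dist(to_mm(p), to_mm(q)) for q in dst) for p in src)
    return dists[math.ceil(0.95 * len(dists)) - 1]


def hd95(pred, ref, spacing=(1.0, 1.0, 1.0)):
    """Symmetric HD95: the larger of the two directed 95th percentile surface distances."""
    sp, sr = surface(pred), surface(ref)
    if not sp or not sr:
        raise ValueError("both masks must be non-empty")
    return max(directed_p95(sp, sr, spacing), directed_p95(sr, sp, spacing))


def cube(origin, size):
    """Toy mask: a solid cube of size**3 voxels starting at origin."""
    ox, oy, oz = origin
    return {(ox + i, oy + j, oz + k)
            for i in range(size) for j in range(size) for k in range(size)}


# Toy example: a 4x4x4 reference cube and a prediction shifted by one voxel in x.
ref = cube((0, 0, 0), 4)
pred = cube((1, 0, 0), 4)

print(dice(pred, ref))             # 0.75
print(fp_fn_fractions(pred, ref))  # (0.25, 0.25)
print(hd95(pred, ref))             # 1.0 (voxel units, since spacing defaults to 1)
```

The brute-force nearest-neighbour search is quadratic in the number of surface voxels, so it suits only small toy masks. For clinical volumes, a distance-transform or spatial-index approach would be the usual choice.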