External validation of a multimodality deep-learning normal tissue complication probability model for mandibular osteoradionecrosis trained on 3D radiation distribution maps and clinical variables
Laia Humbert-Vidan, Christian R. Hansen, Vinod Patel, Jørgen Johansen, Andrew P. King, Teresa Guerrero Urbano
Physics and Imaging in Radiation Oncology, volume 32, Article 100668. Published 2024-10-01. DOI: 10.1016/j.phro.2024.100668. https://www.sciencedirect.com/science/article/pii/S2405631624001386
Abstract
Background and purpose
While the inclusion of spatial dose information in deep learning (DL)-based normal-tissue complication probability (NTCP) models has been the focus of recent research studies, external validation is still lacking. This study aimed to externally validate a DL-based NTCP model for mandibular osteoradionecrosis (ORN) trained on 3D radiation dose distribution maps and clinical variables.
Methods and materials
A 3D DenseNet-40 convolutional neural network (3D-mDN40) was trained on clinical variables and radiation dose distribution maps using a retrospective, class-balanced, matched cohort of 184 subjects. A second model (3D-DN40) was trained on dose maps only, and both DL models were compared to a logistic regression (LR) model trained on dose-volume histogram (DVH) metrics and clinical variables. All models were externally validated in terms of their discriminative ability and calibration on an independent dataset of 82 subjects.
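As a rough illustration of the kind of input the LR comparator uses, the sketch below reduces a list of mandible voxel doses to a few DVH metrics. The abstract does not state which DVH metrics were used, so the function name `dvh_metrics`, the thresholds (30, 50 and 60 Gy) and the assumption of equal voxel volumes are all hypothetical choices made for this example, not the study's method.

```python
def dvh_metrics(doses_gy, thresholds_gy=(30.0, 50.0, 60.0)):
    """Summarise voxel doses (in Gy) as simple DVH metrics.

    Assumes every voxel has the same volume, so a volume fraction is
    just a voxel count fraction. Thresholds are illustrative only.
    """
    n = len(doses_gy)
    if n == 0:
        raise ValueError("doses_gy must contain at least one voxel")

    metrics = {
        "Dmean": sum(doses_gy) / n,  # mean dose over the structure
        "Dmax": max(doses_gy),       # maximum voxel dose
    }
    for t in thresholds_gy:
        # Vx: fraction of the structure receiving at least x Gy
        metrics[f"V{t:g}"] = sum(1 for d in doses_gy if d >= t) / n
    return metrics


if __name__ == "__main__":
    # Four made-up voxel doses, purely for demonstration
    print(dvh_metrics([10.0, 40.0, 55.0, 65.0]))
```

Metrics like these discard where in the mandible the dose was delivered, which is the spatial information the 3D dose-map models are meant to retain.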
Results
No significant difference in performance was observed between models. In internal validation, the models exhibited similar Brier scores of around 0.2 and Log Loss values of 0.6–0.7; ROC AUC values were around 0.7 in internal validation and 0.6 in external validation. Clinical variable distributions and their effect sizes differed between the internal and external cohorts, for example for smoking status (0.6 vs. 0.1, respectively) and chemotherapy (0.1 vs. −0.5, respectively).
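The three reported metrics can be sketched in a few lines of plain Python. This is a minimal illustration using the standard textbook definitions for binary outcomes, not the study's evaluation code; the function names and toy data are assumptions made for this example.

```python
import math


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood of the binary outcomes."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)


def roc_auc(y_true, y_prob):
    """Probability that a random positive is scored above a random negative.

    Ties count as half. Requires at least one positive and one negative.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(
        1.0 if pp > pn else 0.5 if pp == pn else 0.0
        for pp in pos
        for pn in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy labels (1 = ORN, 0 = control) and made-up predicted risks
    y = [0, 0, 1, 1]
    p = [0.1, 0.4, 0.35, 0.8]
    print(brier_score(y, p), log_loss(y, p), roc_auc(y, p))
```

As a reference point, a model that always predicts 0.5 scores a Brier score of 0.25, a Log Loss of ln 2 (about 0.693) and a ROC AUC of 0.5.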
Conclusion
To our knowledge, this is the first study to externally validate a multimodality DL-based ORN NTCP model. Utilising mandible dose distribution maps, these models show promise for enhancing spatial risk assessment and guiding dental and oncological decision-making, though further research is essential to address overfitting and domain shift for reliable clinical use.