Utility of artificial intelligence in radiosurgery for pituitary adenoma: a deep learning-based automated segmentation model and evaluation of its clinical applicability.
Martin Černý, Jaromír May, Lucie Hamáčková, Hana Hallak, Josef Novotný, Denis Baručić, Jan Kybic, Michaela May, Martin Májovský, Michael J Link, Neevya Balasubramaniam, Dalibor Síla, Miriam Babničová, David Netuka, Roman Liščák
Journal of Neurosurgery, pages 1-10. Published 2025-04-18. DOI: 10.3171/2024.12.JNS242167 (https://doi.org/10.3171/2024.12.JNS242167)
Abstract
Objective: The objective of this study was to develop a deep learning model for automated pituitary adenoma segmentation in MRI scans for stereotactic radiosurgery planning and to assess its accuracy and efficiency in clinical settings.
Methods: An nnU-Net-based model was trained on MRI scans with expert segmentations from 582 patients treated with the Leksell Gamma Knife over the course of 12 years. The accuracy of the model was evaluated by a human expert on a separate dataset of 146 previously unseen patients. The primary outcome was the comparison of expert ratings between the predicted segmentations and a control group consisting of the original manual segmentations. Secondary outcomes were the influence of tumor volume, previous surgery, previous stereotactic radiosurgery (SRS), and endocrinological status on expert ratings; performance in a subgroup of nonfunctioning macroadenomas (measuring 1000-4000 mm³) without previous surgery and/or radiosurgery; the influence of using additional MRI modalities as model input; and time cost reduction.
Results: The model achieved Dice similarity coefficients of 82.3%, 63.9%, and 79.6% for tumor, normal gland, and optic nerve, respectively. A human expert rated 20.6% of the segmentations as applicable in treatment planning without any modifications, 52.7% as applicable with minor manual modifications, and 26.7% as inapplicable. The ratings for predicted segmentations were lower than for the control group of original segmentations (p < 0.001). Larger tumor volume, a history of previous radiosurgery, and nonfunctioning pituitary adenoma were associated with better expert ratings (p = 0.005, p = 0.007, and p < 0.001, respectively). In the subgroup without previous surgery, although expert ratings were more favorable, the association did not reach statistical significance (p = 0.074). In the subgroup of noncomplex cases (n = 9), 55.6% of the segmentations were rated as applicable without any manual modifications and no segmentations were rated as inapplicable. Manually improving inaccurate segmentations instead of creating them from scratch led to a 53.6% reduction in time cost (p < 0.001).
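For readers unfamiliar with the Dice similarity coefficient reported above, the following is a minimal sketch of how it is commonly computed for a pair of binary masks: twice the overlap divided by the sum of the two foreground sizes. The function name, the flattened-list representation, the empty-mask convention, and the toy masks are illustrative assumptions; this is not the authors' evaluation code and uses no study data.

```python
from typing import Sequence


def dice_coefficient(predicted: Sequence[int], reference: Sequence[int]) -> float:
    """Dice similarity coefficient between two flattened binary masks.

    DSC = 2 * |P ∩ R| / (|P| + |R|), where P and R are the sets of
    voxels labeled as foreground in each mask.
    """
    if len(predicted) != len(reference):
        raise ValueError("masks must have the same number of voxels")

    # Voxels that are foreground in both masks.
    overlap = sum(1 for p, r in zip(predicted, reference) if p and r)
    # Sum of the foreground sizes of the two masks.
    total = sum(1 for p in predicted if p) + sum(1 for r in reference if r)

    # Assumed convention: two empty masks count as perfect agreement.
    if total == 0:
        return 1.0
    return 2.0 * overlap / total


# Toy example (made-up masks, not study data): 8 voxels, flattened.
predicted_mask = [0, 1, 1, 1, 0, 0, 1, 0]
reference_mask = [0, 1, 1, 0, 0, 1, 1, 0]
print(f"Dice: {dice_coefficient(predicted_mask, reference_mask):.3f}")  # Dice: 0.750
```

In the toy example each mask has four foreground voxels and three of them coincide, so the coefficient is 2 × 3 / (4 + 4) = 0.75. In a multi-structure setting such as tumor, normal gland, and optic nerve, the same calculation would be applied to each label separately.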
Conclusions: The results were applicable for treatment planning with either no or minor manual modifications, demonstrating a significant increase in the efficiency of the planning process. The predicted segmentations can be loaded into the planning software used in clinical practice for treatment planning. The authors discuss some considerations of the clinical utility of the automated segmentation models, as well as their integration within established clinical workflows, and outline directions for future research.
About the journal:
The Journal of Neurosurgery, Journal of Neurosurgery: Spine, Journal of Neurosurgery: Pediatrics, and Neurosurgical Focus are devoted to the publication of original works relating primarily to neurosurgery, including studies in clinical neurophysiology, organic neurology, ophthalmology, radiology, pathology, and molecular biology. The Editors and Editorial Boards encourage submission of clinical and laboratory studies. Other manuscripts accepted for review include technical notes on instruments or equipment that are innovative or useful to clinicians and researchers in the field of neuroscience; papers describing unusual cases; manuscripts on historical persons or events related to neurosurgery; and in Neurosurgical Focus, occasional reviews. Letters to the Editor commenting on articles recently published in the Journal of Neurosurgery, Journal of Neurosurgery: Spine, and Journal of Neurosurgery: Pediatrics are welcome.