Multidisciplinary consensus prostate contours on magnetic resonance imaging: educational atlas and reference standard for artificial intelligence benchmarking.
Yuze Song, Anna Dornisch, Robert T Dess, Daniel Ja Margolis, Eric P Weinberg, Tristan Barrett, Mariel Cornell, Richard E Fan, Mukesh Harisinghani, Sophia C Kamran, Jeong Hoon Lee, Cynthia Xinran Li, Michael A Liss, Mirabela Rusu, Jason Santos, Geoffrey A Sonn, Igor Vidic, Sean A Woolen, Anders M Dale, Tyler M Seibert
International Journal of Radiation Oncology Biology Physics. Published 2025-03-26. DOI: 10.1016/j.ijrobp.2025.03.024 (https://doi.org/10.1016/j.ijrobp.2025.03.024). Impact factor: 6.4 (JCR Q1, Oncology).
Abstract
Introduction: Evaluation of artificial intelligence (AI) algorithms for prostate segmentation is challenging because ground truth is lacking. We aimed to (1) create a reference standard dataset with precise prostate contours by expert consensus and (2) evaluate various AI tools against this standard.
Materials and methods: We obtained prostate MRI cases from XXX. A panel of four experts (two genitourinary radiologists, two prostate radiation oncologists) meticulously developed consensus prostate segmentations on axial T2-weighted series. We evaluated the performance of six AI tools (three commercially available, three academic) using Dice scores, distance from reference contour, and volume error.
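As an illustration of two of the metrics named above, the following is a minimal sketch of a Dice score and a percent volume error for a pair of binary masks. It is not the authors' evaluation code. The function names, the flattened 0/1 mask representation, the `voxel_volume_mm3` parameter, and the choice of an absolute (unsigned) percent error are all assumptions made for the example.

```python
from typing import Sequence


def dice_score(reference: Sequence[int], predicted: Sequence[int]) -> float:
    """Dice = 2|A and B| / (|A| + |B|) for two flattened binary masks."""
    if len(reference) != len(predicted):
        raise ValueError("masks must contain the same number of voxels")
    overlap = sum(1 for r, p in zip(reference, predicted) if r and p)
    total = sum(1 for r in reference if r) + sum(1 for p in predicted if p)
    if total == 0:
        # Both masks empty: treated as perfect agreement (a convention, assumed here).
        return 1.0
    return 2.0 * overlap / total


def percent_volume_error(
    reference: Sequence[int],
    predicted: Sequence[int],
    voxel_volume_mm3: float = 1.0,  # hypothetical parameter: volume of one voxel
) -> float:
    """Absolute volume difference as a percentage of the reference volume."""
    ref_volume = sum(1 for r in reference if r) * voxel_volume_mm3
    pred_volume = sum(1 for p in predicted if p) * voxel_volume_mm3
    if ref_volume == 0:
        raise ValueError("reference mask is empty; percent error is undefined")
    return 100.0 * abs(pred_volume - ref_volume) / ref_volume


# Toy example: 8 voxels, reference has 4 foreground voxels.
reference_mask = [0, 1, 1, 1, 1, 0, 0, 0]
shifted_mask = [0, 0, 1, 1, 1, 1, 0, 0]    # same size, shifted by one voxel
oversized_mask = [0, 1, 1, 1, 1, 1, 0, 0]  # one extra voxel

print(dice_score(reference_mask, shifted_mask))
print(percent_volume_error(reference_mask, shifted_mask))
print(percent_volume_error(reference_mask, oversized_mask))
```

The shifted mask shows why the two metrics are reported together: its volume matches the reference exactly, yet its overlap is imperfect, so a volume error of zero does not by itself imply an accurate contour.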
Results: The panel achieved consensus prostate segmentation on each slice of all 68 patient cases included in the reference dataset. We present two patient examples to serve as contouring guides. Depending on the AI tool, median Dice scores (across patients) ranged from 0.80 to 0.94 for whole prostate segmentation. For a typical (median) patient, AI tools had a mean error over the prostate surface ranging from 1.3 to 2.4 mm. They maximally deviated 3.0 to 9.4 mm outside the prostate and 3.0 to 8.5 mm inside the prostate for a typical patient. Error in prostate volume measurement for a typical patient ranged from 4.3% to 31.4%.
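The distance results above (mean error over the surface, maximal deviation outside and inside the prostate) can be sketched on a single 2D slice as follows. This is an illustrative approximation only: the abstract does not give the exact distance definition, so the one-directional measurement (predicted boundary to reference boundary), the 4-connected boundary rule, the set-of-pixel-coordinates mask representation, and the names `boundary`, `surface_deviation`, and `spacing_mm` are all assumptions.

```python
import math
from typing import Dict, Set, Tuple

Pixel = Tuple[int, int]


def boundary(mask: Set[Pixel]) -> Set[Pixel]:
    """Mask pixels that touch at least one 4-connected background pixel."""
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))
    return {
        (r, c)
        for (r, c) in mask
        if any((r + dr, c + dc) not in mask for dr, dc in steps)
    }


def surface_deviation(
    reference: Set[Pixel],
    predicted: Set[Pixel],
    spacing_mm: Tuple[float, float] = (1.0, 1.0),  # hypothetical pixel spacing
) -> Dict[str, float]:
    """Distances from each predicted boundary pixel to the reference boundary."""
    ref_edge = boundary(reference)
    pred_edge = boundary(predicted)
    if not ref_edge or not pred_edge:
        raise ValueError("both masks must be non-empty")
    row_mm, col_mm = spacing_mm
    all_distances, outside, inside = [], [], []
    for r, c in pred_edge:
        nearest = min(
            math.hypot((r - rr) * row_mm, (c - cc) * col_mm) for rr, cc in ref_edge
        )
        all_distances.append(nearest)
        # A predicted boundary pixel lying within the reference mask counts as
        # an inward deviation; otherwise it is an outward deviation.
        (inside if (r, c) in reference else outside).append(nearest)
    return {
        "mean_mm": sum(all_distances) / len(all_distances),
        "max_outside_mm": max(outside, default=0.0),
        "max_inside_mm": max(inside, default=0.0),
    }


# Toy example: a 5x5 square and the same square shifted two pixels to the right.
reference_square = {(r, c) for r in range(2, 7) for c in range(2, 7)}
shifted_square = {(r, c) for r in range(2, 7) for c in range(4, 9)}
print(surface_deviation(reference_square, shifted_square))
```

A brute-force nearest-point search is used to keep the sketch dependency-free; on full 3D volumes a distance transform would be the more practical choice.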
Discussion: We established an expert consensus benchmark for prostate segmentation. The best-performing AI tools have typical accuracy greater than that reported for radiation oncologists using CT scans (most common clinical approach for radiotherapy planning). Physician review remains essential to detect occasional major errors.
About the journal:
International Journal of Radiation Oncology • Biology • Physics (IJROBP), known in the field as the Red Journal, publishes original laboratory and clinical investigations related to radiation oncology, radiation biology, and medical physics, as well as education and health policy as they relate to the field.
This journal has a particular interest in original contributions of the following types: prospective clinical trials, outcomes research, and large database interrogation. In addition, it seeks reports of high-impact innovations in single or combined modality treatment, tumor sensitization, normal tissue protection (including both precision avoidance and pharmacologic means), brachytherapy, particle irradiation, and cancer imaging. Technical advances related to dosimetry and conformal radiation treatment planning are of interest, as are basic science studies investigating tumor physiology and the molecular biology underlying cancer and normal tissue radiation response.