Real-world performance evaluation of commercial autocontouring software for head and neck cancer radiotherapy
Tom Young, Victoria Butterworth, Sarah Misson, Delali Adjogatse, Anthony Kong, Imran Petkar, Miguel Reis Ferreira, Mary Lei, Andrew King, Teresa Guerrero Urbano
British Journal of Radiology, pp. 1632-1641. Published 2025-10-01. DOI: 10.1093/bjr/tqaf098 (https://doi.org/10.1093/bjr/tqaf098)
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12515040/pdf/
Journal impact factor: 3.4 (JCR Q3, Radiology, Nuclear Medicine & Medical Imaging)
Citations: 0
Abstract
Objectives: To comprehensively evaluate the ability of ART-Plan (autocontouring software using artificial intelligence [AI]) to contour head and neck cancer (HNC) radiotherapy structures (organs-at-risk and elective nodal volumes) in a real-world setting, as required by NICE guidelines.
Methods: Retrospective evaluation (n = 60) compared clinically used contours to AI-contours, using volumetric Dice similarity coefficient (VDSC) and blinded radiation oncologist (RO) contour preference assessment. AI-contours were then generated prospectively for HNC radiotherapy patients (n = 61), before review by ROs. ROs recorded qualitative scoring and review/editing time. Geometric and dosimetric comparison of AI-contours and final contours was undertaken. Correlation coefficients between all metrics were calculated.
Results: Retrospective median VDSC varied widely across structures (0.23-0.88). Of the blinded contour assessments, 31.4% preferred clinician-generated contours, 32.9% preferred AI-generated contours, and 35.7% saw no significant difference. Prospective evaluation showed that AI-contours yielded significant time-saving in reviewing/editing for all structures compared to manual contouring. Qualitative scores demonstrated that most structures had a median score ≥4 (indicating no editing required). Geometric metrics showed high similarity for all structures except larynx. Dosimetric evaluation demonstrated clinically significant dose differences for larynx and elective nodal volumes. Strong correlation was seen between qualitative scoring and all geometric metrics.
Conclusions: AI-contours showed excellent qualitative performance and facilitated time-saving. Protocol differences exist between commercial AI-solutions and implementing centres. Ongoing final clinician review of AI-contours remains essential.
Advances in knowledge: This is the first study demonstrating ART-Plan's capability for excellent qualitative ratings and significant time-savings for HNC radiotherapy contouring in a real-world workflow setting. 5-point Likert-scale qualitative scoring correlates strongly with geometric metrics.
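For readers unfamiliar with the VDSC used in the retrospective comparison, the following is a minimal sketch of the standard Dice formula, 2|A ∩ B| / (|A| + |B|), applied to two contours. It assumes each contour is represented as a set of voxel indices; the function names `volumetric_dice` and `box` and the toy volumes are illustrative only and are not taken from the study or from ART-Plan.

```python
from typing import Iterable, Set, Tuple

Voxel = Tuple[int, int, int]


def volumetric_dice(a: Iterable[Voxel], b: Iterable[Voxel]) -> float:
    """Dice = 2|A ∩ B| / (|A| + |B|) for two collections of voxel indices."""
    set_a: Set[Voxel] = set(a)
    set_b: Set[Voxel] = set(b)
    total = len(set_a) + len(set_b)
    if total == 0:
        # Convention chosen for this sketch: two empty contours agree perfectly.
        return 1.0
    return 2.0 * len(set_a & set_b) / total


def box(x0: int, x1: int, y0: int, y1: int, z0: int, z1: int) -> Set[Voxel]:
    """Hypothetical helper: all voxels in a half-open axis-aligned box."""
    return {
        (x, y, z)
        for x in range(x0, x1)
        for y in range(y0, y1)
        for z in range(z0, z1)
    }


# Toy example: a 4x4x4 "clinical" contour and an "AI" contour shifted by one voxel in x.
clinical = box(0, 4, 0, 4, 0, 4)  # 64 voxels
ai = box(1, 5, 0, 4, 0, 4)        # 64 voxels, 48 of them shared with `clinical`
score = volumetric_dice(clinical, ai)
print(f"VDSC = {score:.2f}")  # prints "VDSC = 0.75"
```

A score of 1 means the two contours coincide exactly and 0 means they do not overlap, which puts the reported median range of 0.23-0.88 in context.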
About the journal:
BJR is the international research journal of the British Institute of Radiology and is the oldest scientific journal in the field of radiology and related sciences.
Dating back to 1896, BJR’s history is radiology’s history, and the journal has featured some landmark papers such as the first description of Computed Tomography "Computerized transverse axial tomography" by Godfrey Hounsfield in 1973. A valuable historical resource, the complete BJR archive has been digitized from 1896.
Quick Facts:
- 2015 Impact Factor – 1.840
- Receipt to first decision – average of 6 weeks
- Acceptance to online publication – average of 3 weeks
- ISSN: 0007-1285
- eISSN: 1748-880X
Open Access option