Automatic contour quality assurance using deep-learning based contours.

IF 3.3; Zone 3 (Medicine); JCR Q2 (Engineering, Biomedical)
Barbara Marquez, David Fuentes, Christine B Peterson, Dong Joo Rhee, Raphael J Douglas, Raymond P Mumme, Anuja Jhingran, Julianne M Pollard, Surendra Prajapati, Thomas Whitaker, Laurence E Court
Physics in Medicine and Biology. Journal article, published 2025-06-18. DOI: 10.1088/1361-6560/ade5e6 (https://doi.org/10.1088/1361-6560/ade5e6)
Citations: 0

Abstract

Objective. Safe deployment of auto-contouring models requires automated QA. One approach is to use two independent auto-contouring models and compare their contours geometrically for acceptability. This is not effective because geometric differences may not correlate with clinically significant errors. Herein, we investigated whether a two-contour QA system is improved by including dose in this comparison.

Approach. VMAT plans were generated for 86 head and neck (H&N) and 50 cervical (GYN) cancer patients, using clinically approved PTVs and auto-contoured OARs from a primary auto-contouring model. Doses to the primary OARs were compared with doses to manually drawn and approved OARs ("the truth"). A difference in Dmean or Dmax ≥ 2 Gy was identified as a reporting error (Derror). A second, independent auto-contouring model was then used to contour the OARs (verification). The primary and verification auto-contouring models were compared geometrically (DSC, sDSC, HD95, MSD) and dosimetrically (Dmean, Dmax). The ability of comparison metrics between the two auto-contouring models to flag actual dosimetric errors (i.e. primary model compared with the truth) was investigated. A logistic regression model was used to predict Derror. The data were divided by disease site and into 50/50 stratified training and testing sets; k-fold cross-validation was employed during training to avoid overfitting. H&N structures were further divided into size-specific groups to improve model performance and generalizability.

Main Results. Including dose metrics in the logistic regression model to predict Derror,mean increased performance in terms of ROC-AUC and AU-PRC in the test set for H&N small structures. For Derror,max, including dose metrics increased performance for H&N small structures, H&N medium structures, and GYN structures.

Significance. In many instances, utilizing dose with geometric comparisons can improve the ability of a verification model to flag potential errors from a primary auto-contouring model.
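The labelling and comparison steps described in the Approach can be illustrated with a minimal sketch. This is not the authors' implementation: the voxel-set representation, the function names, the use of an absolute dose difference, and the toy dose values are all assumptions made for illustration. Only the 2 Gy threshold and the choice of DSC, Dmean, and Dmax come from the abstract. All contours are evaluated on one dose grid, mirroring a single plan evaluated against different contour sets.

```python
from typing import Dict, Set, Tuple

Voxel = Tuple[int, int, int]       # (z, y, x) index into the dose grid
DoseGrid = Dict[Voxel, float]      # dose in Gy per voxel (toy representation)

THRESHOLD_GY = 2.0                 # from the abstract: a difference >= 2 Gy is an error


def dice(a: Set[Voxel], b: Set[Voxel]) -> float:
    """Dice similarity coefficient, 2|A ∩ B| / (|A| + |B|)."""
    if not a and not b:
        return 1.0  # assumption: two empty contours count as perfect agreement
    return 2.0 * len(a & b) / (len(a) + len(b))


def dose_stats(contour: Set[Voxel], dose: DoseGrid) -> Tuple[float, float]:
    """Return (Dmean, Dmax) in Gy over the voxels inside a non-empty contour."""
    values = [dose[v] for v in contour]
    return sum(values) / len(values), max(values)


def comparison_features(primary: Set[Voxel], verification: Set[Voxel],
                        dose: DoseGrid) -> Dict[str, float]:
    """Primary-vs-verification metrics: candidate inputs to an error classifier."""
    p_mean, p_max = dose_stats(primary, dose)
    v_mean, v_max = dose_stats(verification, dose)
    return {
        "dsc": dice(primary, verification),
        "delta_dmean": abs(p_mean - v_mean),  # assumption: absolute difference
        "delta_dmax": abs(p_max - v_max),
    }


def derror_labels(primary: Set[Voxel], truth: Set[Voxel],
                  dose: DoseGrid) -> Dict[str, bool]:
    """Primary-vs-truth labels: the targets the classifier would predict."""
    p_mean, p_max = dose_stats(primary, dose)
    t_mean, t_max = dose_stats(truth, dose)
    return {
        "derror_mean": abs(p_mean - t_mean) >= THRESHOLD_GY,
        "derror_max": abs(p_max - t_max) >= THRESHOLD_GY,
    }


# Toy example: six voxels in a row with a steep dose gradient (hypothetical values).
dose = {(0, 0, x): 10.0 * x for x in range(6)}
truth = {(0, 0, x) for x in (1, 2, 3)}
primary = {(0, 0, x) for x in (2, 3, 4)}
verification = {(0, 0, x) for x in (1, 2, 3, 4)}

features = comparison_features(primary, verification, dose)
labels = derror_labels(primary, truth, dose)
```

In this toy case the primary and verification contours overlap well geometrically, yet the primary contour differs from the truth by more than 2 Gy in both Dmean and Dmax. That is the kind of case in which a dose-based comparison feature may carry information the geometric one does not.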

Source journal
Physics in Medicine and Biology (Medicine; Engineering, Biomedical)
CiteScore: 6.50
Self-citation rate: 14.30%
Articles published: 409
Review time: 2 months
Journal description: The development and application of theoretical, computational and experimental physics to medicine, physiology and biology. Topics covered are: therapy physics (including ionizing and non-ionizing radiation); biomedical imaging (e.g. x-ray, magnetic resonance, ultrasound, optical and nuclear imaging); image-guided interventions; image reconstruction and analysis (including kinetic modelling); artificial intelligence in biomedical physics and analysis; nanoparticles in imaging and therapy; radiobiology; radiation protection and patient dose monitoring; radiation dosimetry