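Diagnostic Performance of Universal versus Stratified Computer-Aided Detection Thresholds for Chest X-Ray-Based Tuberculosis Screening.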

Joowhan Sung, Peter James Kitonsa, Annet Nalutaaya, David Isooba, Susan Birabwa, Keneth Ndyabayunga, Rogers Okura, Jonathan Magezi, Deborah Nantale, Ivan Mugabi, Violet Nakiiza, David W Dowdy, Achilles Katamba, Emily A Kendall
Journal: medRxiv : the preprint server for health sciences
Publication date: 2025-06-24
DOI: 10.1101/2025.04.09.25325458 (https://doi.org/10.1101/2025.04.09.25325458)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12036410/pdf/

Abstract

Diagnostic Performance of Universal versus Stratified Computer-Aided Detection Thresholds for Chest X-Ray-Based Tuberculosis Screening.

Background: Computer-aided detection (CAD) software analyzes chest X-rays for features suggestive of tuberculosis (TB) and provides a numeric abnormality score. However, estimates of CAD accuracy for TB screening are hindered by the lack of confirmatory data among people with lower CAD scores, including those without symptoms. Additionally, the appropriate CAD score thresholds for obtaining further testing may vary according to population and client characteristics.

Methods: We screened for TB in Ugandan individuals aged ≥15 years using portable chest X-rays with CAD (qXR v3). Participants were offered screening regardless of their symptoms. Those with X-ray scores at or above a threshold of 0.1 (range, 0-1) were asked to provide sputum for Xpert Ultra testing. We estimated the diagnostic accuracy of CAD for detecting Xpert-positive TB when using the same threshold for all individuals (under different assumptions about TB prevalence among people with X-ray scores <0.1), and compared this estimate to age- and/or sex-stratified approaches.

Findings: Of 52,835 participants screened for TB using CAD, 8,949 (16.9%) had X-ray scores ≥0.1. Of 7,219 participants with valid Xpert Ultra results, 382 (5.3%) were Xpert-positive, including 81 with trace results. Assuming 0.1% of participants with X-ray scores <0.1 would have been Xpert-positive if tested, qXR had an estimated AUC of 0.920 (95% confidence interval 0.898-0.941) for Xpert-positive TB. Stratifying CAD thresholds according to age and sex improved accuracy; for example, at 96.1% specificity, estimated sensitivity was 75.0% for a universal threshold (of ≥0.65) versus 76.9% for thresholds stratified by age and sex (p=0.046).
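
The proportions quoted in the Findings can be re-derived from the reported counts. The short Python sketch below does only that arithmetic, and also shows how many Xpert-positive participants the 0.1% assumption would imply among those scoring <0.1. The variable names are illustrative, and the sketch does not reproduce the study's AUC estimation, which the abstract does not describe in detail.

```python
# Counts as reported in the Findings paragraph.
screened = 52_835            # participants screened with CAD
scored_at_or_above = 8_949   # X-ray score >= 0.1
valid_xpert = 7_219          # participants with valid Xpert Ultra results
xpert_positive = 382         # Xpert-positive, including trace results

# Assumption stated in the abstract: 0.1% of participants with scores < 0.1
# would have been Xpert-positive if tested.
assumed_prev_below = 0.001

pct_at_or_above = 100 * scored_at_or_above / screened
pct_positive = 100 * xpert_positive / valid_xpert
scored_below = screened - scored_at_or_above
expected_pos_below = assumed_prev_below * scored_below

print(f"Score >= 0.1: {pct_at_or_above:.1f}%")               # 16.9%
print(f"Xpert-positive among valid: {pct_positive:.1f}%")    # 5.3%
print(f"Participants with score < 0.1: {scored_below}")      # 43886
print(f"Implied positives below 0.1: {expected_pos_below:.1f}")  # 43.9
```

Under the stated assumption, roughly 44 additional Xpert-positive participants would be expected among those who were not asked for sputum, which is why the assumed prevalence below the cutoff affects the estimated sensitivity.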

Interpretation: The accuracy of CAD for TB screening among all screening participants, including those without symptoms or abnormal chest X-rays, is higher than previously estimated. Stratifying CAD thresholds based on client characteristics such as age and sex could further improve accuracy, enabling a more effective and personalized approach to TB screening.
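
To make the distinction between a universal and a stratified threshold concrete, here is a minimal Python sketch. Only the universal threshold of 0.65 comes from the abstract. The stratum definitions, the age cutoff of 45 years, and every stratum-specific threshold value are hypothetical placeholders chosen for illustration, not the values used in the study.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    score: float  # CAD abnormality score on the 0-1 scale
    age: int      # age in years
    sex: str      # "M" or "F"


# Universal threshold quoted in the Findings.
UNIVERSAL_THRESHOLD = 0.65

# HYPOTHETICAL stratum-specific thresholds, for illustration only:
# lower where TB prevalence is assumed higher (e.g., men), higher where
# false-positive X-ray findings are assumed more common (e.g., older age).
STRATIFIED_THRESHOLDS = {
    ("M", "younger"): 0.50,
    ("M", "older"): 0.70,
    ("F", "younger"): 0.60,
    ("F", "older"): 0.80,
}


def age_band(age: int, cutoff: int = 45) -> str:
    """Assign a hypothetical two-level age band."""
    return "older" if age >= cutoff else "younger"


def refer_universal(p: Participant) -> bool:
    """Refer for confirmatory testing using one threshold for everyone."""
    return p.score >= UNIVERSAL_THRESHOLD


def refer_stratified(p: Participant) -> bool:
    """Refer using the threshold of the participant's age/sex stratum."""
    return p.score >= STRATIFIED_THRESHOLDS[(p.sex, age_band(p.age))]


young_man = Participant(score=0.55, age=30, sex="M")
older_woman = Participant(score=0.70, age=60, sex="F")
print(refer_universal(young_man), refer_stratified(young_man))      # False True
print(refer_universal(older_woman), refer_stratified(older_woman))  # True False
```

The two example participants are classified differently by the two rules, which is the mechanism by which stratification can trade referrals between strata while holding overall specificity fixed.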

Funding: National Institutes of Health.

Research in context:

Evidence before this study: The World Health Organization (WHO) has endorsed computer-aided detection (CAD) as a screening tool for tuberculosis (TB), but the appropriate CAD score that triggers further diagnostic evaluation for tuberculosis varies by population. The WHO recommends determining the appropriate CAD threshold for specific settings and populations and considering unique thresholds for specific populations, including older age groups, among whom CAD may perform poorly. We performed a PubMed literature search for articles published until September 9, 2024, using the search terms "tuberculosis" AND ("computer-aided detection" OR "computer aided detection" OR "CAD" OR "computer-aided reading" OR "computer aided reading" OR "artificial intelligence"), which resulted in 704 articles. Among them, we identified studies that evaluated the performance of CAD for tuberculosis screening and additionally reviewed relevant references. Most prior studies reported areas under the curve (AUC) ranging from 0.76 to 0.88 but limited their evaluations to individuals with symptoms or abnormal chest X-rays. Some prior studies identified subgroups (including older individuals and people with prior TB) among whom CAD had lower-than-average AUCs, and authors discussed how the prevalence of such characteristics could affect the optimal value of a population-wide CAD threshold; however, none estimated the accuracy that could be gained by adjusting CAD thresholds between individuals based on personal characteristics.

Added value of this study: In this study, all consenting individuals in a high-prevalence setting were offered chest X-ray screening, regardless of symptoms, if they were ≥15 years old, not pregnant, and not on TB treatment. A very low CAD score cutoff (qXR v3 score of 0.1 on a 0-1 scale) was used to select individuals for confirmatory sputum molecular testing, enabling the detection of radiographically mild forms of TB and facilitating comparisons of diagnostic accuracy at different CAD thresholds. With this more expansive, symptom-neutral evaluation of CAD, we estimated an AUC of 0.920, and we found that the qXR v3 threshold needed to decrease to under 0.1 to meet the WHO target product profile goal of ≥90% sensitivity and ≥70% specificity. Compared to using the same thresholds for all participants, adjusting CAD thresholds by age and sex strata resulted in a 1 to 2% increase in sensitivity without affecting specificity.

Implications of all the available evidence: To obtain high sensitivity with CAD screening in high-prevalence settings, low score thresholds may be needed. However, countries with a high burden of TB often do not have sufficient resources to test all individuals above a low threshold. In such settings, adjusting CAD thresholds based on individual characteristics associated with TB prevalence (e.g., male sex) and those associated with false-positive X-ray results (e.g., old age) can potentially improve the efficiency of TB screening programs.
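
The WHO target product profile goal mentioned above (≥90% sensitivity and ≥70% specificity) can be expressed as a simple check on a sensitivity/specificity pair. The Python sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, the six-participant dataset is synthetic, and the code does not reproduce the study's estimation procedure, which additionally had to account for untested participants with scores <0.1.

```python
def sens_spec(scores, labels, threshold):
    """Sensitivity and specificity when scores >= threshold count as positive.

    labels: 1 for confirmed TB, 0 for no TB (the reference standard).
    """
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def meets_tpp(sensitivity, specificity, min_sens=0.90, min_spec=0.70):
    """Check the TPP goal quoted in the text: >=90% sens and >=70% spec."""
    return sensitivity >= min_sens and specificity >= min_spec


# Operating point reported in the Findings for the universal threshold:
# 75.0% sensitivity at 96.1% specificity.
print(meets_tpp(0.750, 0.961))  # False: sensitivity is below 90%

# SYNTHETIC toy data, for illustration only.
toy_scores = [0.05, 0.20, 0.70, 0.90, 0.30, 0.95]
toy_labels = [0, 0, 0, 1, 1, 1]
for threshold in (0.65, 0.25):
    sens, spec = sens_spec(toy_scores, toy_labels, threshold)
    print(threshold, round(sens, 3), round(spec, 3), meets_tpp(sens, spec))
```

Lowering the threshold in the toy data raises sensitivity while leaving specificity unchanged or lower, mirroring the trade-off the text describes between meeting the sensitivity goal and the number of people who must be tested.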
