Behavior of test specificity under an imperfect gold standard: findings from a simulation study and analysis of real-world oncology data.

Impact factor 3.9 | Zone 3, Medicine | JCR Q1, Health Care Sciences & Services

BMC Medical Research Methodology Pub Date : 2025-05-30 DOI:10.1186/s12874-025-02603-4

Mark S Walker, Lukas Slipski, Yanina Natanzon

Volume 25, issue 1, article 151. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12125893/pdf/


Abstract

Background: Gold standards used in validation of new tests may be imperfect, with sensitivity or specificity less than 100%. The impact of imperfection in a gold standard on measured test attributes has been demonstrated formally, but its relevance in real-world oncology research may not be well understood.

Methods: This simulation study examined how imperfect gold standard sensitivity affects measured test specificity at different levels of condition prevalence, using a hypothetical real-world measure of death. The study also evaluated real-world oncology datasets linked to a National Death Index (NDI) dataset, to examine the measured specificity of a death indicator at levels of death prevalence that matched the simulation. Both the simulation and the real-world data analysis examined measured specificity of the death indicator at death prevalence ranging from 50% to 98%. To isolate the effects of death prevalence and imperfect gold standard sensitivity, the simulation assumed a test with perfect sensitivity and specificity and a gold standard with perfect specificity; gold standard sensitivity was modeled at values from 90% to 99%.
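Under the assumptions stated in the Methods (perfect test, perfect gold standard specificity, imperfect gold standard sensitivity), the expected measured specificity can be written in closed form. The derivation below is a sketch, not the authors' own formulation; the symbols $p$ (death prevalence), $\mathrm{Se}_G$ (gold standard sensitivity), $T^-$ (test negative) and $G^-$ (gold standard negative) are notation introduced here for illustration.

```latex
% Assumed notation (not from the source):
%   p      = true death prevalence
%   Se_G   = gold standard sensitivity (gold standard specificity = 1)
%   T^-    = test says "alive", G^- = gold standard says "alive"
% The test is perfect, so it is negative only for patients who are truly alive.
\begin{align*}
P(G^-) &= \underbrace{(1-p)}_{\text{truly alive}}
        + \underbrace{p\,(1-\mathrm{Se}_G)}_{\text{deaths missed by the gold standard}} \\
P(T^-,\, G^-) &= 1-p \\
\mathrm{Sp}_{\text{measured}} &= \frac{P(T^-,\, G^-)}{P(G^-)}
  = \frac{1-p}{(1-p) + p\,(1-\mathrm{Se}_G)}
\end{align*}
```

The denominator shows why prevalence matters: as $p$ approaches 1, the truly alive group shrinks while the pool of missed deaths grows, so even a small $1-\mathrm{Se}_G$ dominates the gold-standard-negative group.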

Results: The simulation showed that decreasing gold standard sensitivity was associated with increasing underestimation of test specificity, and that the extent of underestimation increased with death prevalence. Analysis of the real-world data yielded findings that closely matched the simulation pattern. At 98% death prevalence, even near-perfect gold standard sensitivity (99%) suppressed specificity from its true value of 100% to a measured value of <67%.
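The pattern in the Results can be illustrated with a short calculation. The sketch below is not the authors' simulation code; it assumes the expected-value relationship implied by the stated setup (perfect test, perfect gold standard specificity), in which measured specificity equals the truly alive fraction divided by the sum of that fraction and the deaths the gold standard misses. The function name `measured_specificity` and the grid values are hypothetical, chosen only to fall within the 50% to 98% prevalence and 90% to 99% sensitivity ranges named in the abstract.

```python
def measured_specificity(prevalence: float, gold_sensitivity: float) -> float:
    """Expected measured specificity of a perfect test when scored against a
    gold standard with perfect specificity but imperfect sensitivity.

    Assumption (illustrative, not the authors' code): expected proportions
    are used instead of random draws.
    """
    if not (0.0 <= prevalence <= 1.0 and 0.0 <= gold_sensitivity <= 1.0):
        raise ValueError("prevalence and gold_sensitivity must lie in [0, 1]")

    truly_alive = 1.0 - prevalence                          # test and gold standard both negative
    missed_deaths = prevalence * (1.0 - gold_sensitivity)   # gold standard negative, test positive
    gold_negatives = truly_alive + missed_deaths
    if gold_negatives == 0.0:
        raise ValueError("no gold-standard negatives; specificity is undefined")
    return truly_alive / gold_negatives


# Hypothetical grid inside the ranges reported in the abstract.
PREVALENCES = (0.50, 0.70, 0.90, 0.98)
GOLD_SENSITIVITIES = (0.90, 0.95, 0.99)

print("gold Se | " + " | ".join(f"prev {p:.0%}" for p in PREVALENCES))
for se in GOLD_SENSITIVITIES:
    row = " | ".join(f"{measured_specificity(p, se):8.3f}" for p in PREVALENCES)
    print(f"{se:7.0%} | {row}")
```

For 98% prevalence and 99% gold standard sensitivity this expression evaluates to roughly 0.67, close to the figure reported above; the authors' simulated values may differ somewhat, since the abstract does not describe their procedure in detail.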

Conclusions: New validation research, and reviews of existing validation studies, should consider the prevalence of the condition assessed by a measure and the possible impact of an imperfect gold standard on measured sensitivity and specificity.


Source journal

BMC Medical Research Methodology (Medicine: Health Care)

CiteScore: 6.50

Self-citation rate: 2.50%

Articles published: 298

Review time: 3-8 weeks

About the journal: BMC Medical Research Methodology is an open access journal publishing original peer-reviewed research articles in methodological approaches to healthcare research. Articles on the methodology of epidemiological research, clinical trials and meta-analysis/systematic review are particularly encouraged, as are empirical studies of the associations between choice of methodology and study outcomes. BMC Medical Research Methodology does not aim to publish articles describing scientific methods or techniques: these should be directed to the BMC journal covering the relevant biomedical subject area.