Classification under uncertainty: data analysis for diagnostic antibody testing

Impact factor: 1.5

Mathematical medicine and biology: a journal of the IMA · Volume 38, Issue 3, pp. 396-416 · Pub Date: 2021-03-01 · DOI: 10.1093/imammb/dqab007

Paul N Patrone; Anthony J Kearsley


Citations: 9

Abstract




Formulating accurate and robust classification strategies is a key challenge of developing diagnostic and antibody tests. Methods that do not explicitly account for disease prevalence and uncertainty therein can lead to significant classification errors. We present a novel method that leverages optimal decision theory to address this problem. As a preliminary step, we develop an analysis that uses an assumed prevalence and conditional probability models of diagnostic measurement outcomes to define optimal (in the sense of minimizing rates of false positives and false negatives) classification domains. Critically, we demonstrate how this strategy can be generalized to a setting in which the prevalence is unknown by either (i) defining a third class of hold-out samples that require further testing or (ii) using an adaptive algorithm to estimate prevalence prior to defining classification domains. We also provide examples for a recently published SARS-CoV-2 serology test and discuss how measurement uncertainty (e.g. associated with instrumentation) can be incorporated into the analysis. We find that our new strategy decreases classification error by up to a decade relative to more traditional methods based on confidence intervals. Moreover, it establishes a theoretical foundation for generalizing techniques such as receiver operating characteristics by connecting them to the broader field of optimization.
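The classification idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian conditional densities, their parameters, and all function names below are hypothetical choices made for illustration. With an assumed prevalence, the sketch uses the standard Bayes decision rule (assign a measurement to the class with the larger prevalence-weighted density). When the prevalence is only known to lie in an interval, it assigns a measurement to a third hold-out class whenever the two endpoint prevalences disagree, which is one plausible reading of option (i).

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution evaluated at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


# Hypothetical conditional probability models of a measurement r.
# These parameters are illustrative assumptions, not values from the paper.
def density_negative(r):
    return gaussian_pdf(r, 0.0, 1.0)


def density_positive(r):
    return gaussian_pdf(r, 3.0, 1.0)


def classify(r, prevalence):
    """Two-class rule: choose the larger prevalence-weighted density."""
    weighted_positive = prevalence * density_positive(r)
    weighted_negative = (1.0 - prevalence) * density_negative(r)
    return "positive" if weighted_positive > weighted_negative else "negative"


def classify_with_holdout(r, prev_low, prev_high):
    """Three-class rule when prevalence is only known to lie in [prev_low, prev_high].

    The two-class decision is monotone in prevalence, so agreement at both
    endpoints implies agreement for every prevalence in between.
    """
    at_low = classify(r, prev_low)
    at_high = classify(r, prev_high)
    return at_low if at_low == at_high else "hold-out"


if __name__ == "__main__":
    for r in (0.0, 2.0, 4.0):
        print(r, classify(r, 0.5), classify_with_holdout(r, 0.01, 0.5))
```

In this toy model the decision threshold moves from r = 1.5 at a prevalence of 0.5 to roughly r = 3.03 at a prevalence of 0.01, so measurements between those values land in the hold-out class when the prevalence is only known to lie in [0.01, 0.5].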


Source journal

Mathematical medicine and biology: a journal of the IMA

Self-citation rate

0.00%

Articles published