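General bounds on the area under the receiver operating characteristic curve and other performance measures when only a single sensitivity and specificity point is known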

Impact factor: 0.6 | Zone 4 (Economics) | JCR Q4 (Business, Finance)

Journal of Risk Model Validation | Publication date: 2022-01-01 | DOI: 10.21314/jrmv.2022.019

Roger M. Stein


Citations: 0

Abstract


General bounds on the area under the receiver operating characteristic curve and other performance measures when only a single sensitivity and specificity point is known

Receiver operating characteristic (ROC) curves are often used to quantify the performance of predictive models used in diagnosis, risk stratification and rating systems. The ROC area under the curve (AUC) summarizes the ROC in a single statistic, which also provides a probabilistic interpretation that is isomorphic to the Mann–Whitney–Wilcoxon test. In many settings, such as those involving diagnostic tests for diseases or antibodies, information about the ROC is not reported; instead, the true positive (TP) and true negative (TN) rates are reported for a single threshold value. We demonstrate how to calculate the upper and lower bounds for the ROC AUC, given a single (TP, TN) pair. We use simple geometric arguments only, and we present two examples of real-world applications from medicine and finance, involving Covid-19 diagnosis and credit card fraud detection, respectively. In addition, we introduce formally the notion of “pathological” ROC curves and “well-behaved” ROC curves. In the case of well-behaved ROC curves, the bounds on the AUC may be made tighter. In certain special cases involving pathological ROC curves that result from what we term “George Costanza” classifiers, we may transform predictions to obtain well-behaved ROC curves with higher AUC than the original decision process. Our results also enable the calculation of other quantities of interest, such as Cohen’s d or the Pearson correlation between a diagnostic outcome and an actual outcome. These results facilitate the direct comparison of reported performance when model or diagnostic performance is reported for only a single score threshold. © 2022 Infopro Digital Risk (IP) Limited
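The geometric idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names are hypothetical, and the formulas are only what follows from elementary geometry for any nondecreasing ROC curve forced through the single point (1 - TN, TP). The tighter lower bound assumes that a "well-behaved" curve is concave, and the flip assumes that a "George Costanza" classifier is one whose decisions are worse than chance; both are our readings, since the abstract does not define the terms. The Cohen's d conversion uses the standard equal-variance binormal relation AUC = Φ(d/√2), which is also an assumption here.

```python
from math import sqrt
from statistics import NormalDist


def _check_rate(value: float, name: str) -> None:
    """Reject rates outside the unit interval."""
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"{name} must lie in [0, 1], got {value}")


def auc_bounds(tp_rate: float, tn_rate: float) -> tuple[float, float]:
    """Bounds on AUC for ANY nondecreasing ROC through (1 - TN, TP).

    Lower bound: the curve stays at 0 until FPR = 1 - TN, then stays at
    TP until FPR = 1, enclosing a rectangle of area TP * TN.
    Upper bound: the curve jumps to TP at once, then to 1 right after
    FPR = 1 - TN, leaving out a rectangle of area (1 - TN) * (1 - TP).
    """
    _check_rate(tp_rate, "tp_rate")
    _check_rate(tn_rate, "tn_rate")
    fp_rate = 1.0 - tn_rate
    lower = tp_rate * tn_rate
    upper = 1.0 - fp_rate * (1.0 - tp_rate)
    return lower, upper


def concave_lower_bound(tp_rate: float, tn_rate: float) -> float:
    """Lower bound on AUC if the ROC is assumed concave (our assumption).

    A concave curve through (0, 0), (1 - TN, TP) and (1, 1) lies on or
    above the two chords joining those points; the area under the chords
    is (TP + TN) / 2.
    """
    _check_rate(tp_rate, "tp_rate")
    _check_rate(tn_rate, "tn_rate")
    if tp_rate + tn_rate < 1.0:
        raise ValueError("point lies below the diagonal; no concave ROC passes through it")
    return (tp_rate + tn_rate) / 2.0


def flip_point(tp_rate: float, tn_rate: float) -> tuple[float, float]:
    """Invert every decision: positives become negatives and vice versa."""
    _check_rate(tp_rate, "tp_rate")
    _check_rate(tn_rate, "tn_rate")
    return 1.0 - tp_rate, 1.0 - tn_rate


def cohens_d_from_auc(auc: float) -> float:
    """Cohen's d implied by an AUC under an equal-variance binormal model."""
    if not 0.0 < auc < 1.0:
        raise ValueError("auc must lie strictly between 0 and 1")
    return sqrt(2.0) * NormalDist().inv_cdf(auc)
```

As a hypothetical illustration (the values are not taken from the paper), a test reporting TP = 0.90 and TN = 0.95 gives general bounds of roughly 0.855 and 0.995 from `auc_bounds(0.90, 0.95)`, and a concave-curve lower bound of roughly 0.925 from `concave_lower_bound(0.90, 0.95)`. Passing either AUC bound to `cohens_d_from_auc` then gives a corresponding range for Cohen's d under the binormal assumption.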


Source journal

Journal of Risk Model Validation (Business, Finance)

CiteScore: 1.20

Self-citation rate: 28.60%

About the journal: As monetary institutions rely greatly on economic and financial models for a wide array of applications, model validation has become progressively inventive within the field of risk. The Journal of Risk Model Validation focuses on the implementation and validation of risk models, and aims to provide a greater understanding of key issues including the empirical evaluation of existing models, pitfalls in model validation and the development of new methods. We also publish papers on back-testing. Our main field of application is in credit risk modelling but we are happy to consider any issues of risk model validation for any financial asset class. The Journal of Risk Model Validation considers submissions in the form of research papers on topics including, but not limited to: empirical model evaluation studies; backtesting studies; stress-testing studies; new methods of model validation/backtesting/stress-testing; best practices in model development, deployment, production and maintenance; and pitfalls in model validation techniques (all types of risk, forecasting, pricing and rating).