Roger M. Stein
求助PDF
{"title":"当只知道单个灵敏度和特异度点时,接收器工作特性曲线下的面积和其他性能指标的一般界限","authors":"Roger M. Stein","doi":"10.21314/jrmv.2022.019","DOIUrl":null,"url":null,"abstract":"Receiver operating characteristic (ROC) curves are often used to quantify the performance of predictive models used in diagnosis, risk stratification and rating systems. The ROC area under the curve (AUC) summarizes the ROC in a single statistic, which also provides a probabilistic interpretation that is isomorphic to the Mann– Whitney–Wilcoxon test. In many settings, such as those involving diagnostic tests for diseases or antibodies, information about the ROC is not reported;instead the true positive. TP / and true negative. TN / rates are reported for a single threshold value. We demonstrate how to calculate the upper and lower bounds for the ROC AUC, given a single. TP;TN / pair. We use simple geometric arguments only, and we present two examples of real-world applications from medicine and finance, involving Covid-19 diagnosis and credit card fraud detection, respectively. In addition, we introduce formally the notion of “pathological” ROC curves and “well-behaved” ROC curves. In the case of well-behaved ROC curves, the bounds on the AUC may be made tighter. In certain special cases involving pathological ROC curves that result from what we term “George Costanza” classifiers, we may transform predictions to obtain well-behaved ROC curves with higher AUC than the original decision process. Our results also enable the calculation of other quantities of interest, such as Cohen’s d or the Pearson correlation between a diagnostic outcome and an actual outcome. These results facilitate the direct comparison of reported performance when model or diagnostic performance is reported for only a single score threshold. © 2022. Infopro Digital Risk (IP) Limited","PeriodicalId":43447,"journal":{"name":"Journal of Risk Model Validation","volume":"1 1","pages":""},"PeriodicalIF":0.4000,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"General bounds on the area under the receiver operating characteristic curve and other performance measures when only a single sensitivity and specificity point is known\",\"authors\":\"Roger M. Stein\",\"doi\":\"10.21314/jrmv.2022.019\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Receiver operating characteristic (ROC) curves are often used to quantify the performance of predictive models used in diagnosis, risk stratification and rating systems. The ROC area under the curve (AUC) summarizes the ROC in a single statistic, which also provides a probabilistic interpretation that is isomorphic to the Mann– Whitney–Wilcoxon test. In many settings, such as those involving diagnostic tests for diseases or antibodies, information about the ROC is not reported;instead the true positive. TP / and true negative. TN / rates are reported for a single threshold value. We demonstrate how to calculate the upper and lower bounds for the ROC AUC, given a single. TP;TN / pair. We use simple geometric arguments only, and we present two examples of real-world applications from medicine and finance, involving Covid-19 diagnosis and credit card fraud detection, respectively. In addition, we introduce formally the notion of “pathological” ROC curves and “well-behaved” ROC curves. In the case of well-behaved ROC curves, the bounds on the AUC may be made tighter. In certain special cases involving pathological ROC curves that result from what we term “George Costanza” classifiers, we may transform predictions to obtain well-behaved ROC curves with higher AUC than the original decision process. Our results also enable the calculation of other quantities of interest, such as Cohen’s d or the Pearson correlation between a diagnostic outcome and an actual outcome. These results facilitate the direct comparison of reported performance when model or diagnostic performance is reported for only a single score threshold. © 2022. Infopro Digital Risk (IP) Limited\",\"PeriodicalId\":43447,\"journal\":{\"name\":\"Journal of Risk Model Validation\",\"volume\":\"1 1\",\"pages\":\"\"},\"PeriodicalIF\":0.4000,\"publicationDate\":\"2022-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Risk Model Validation\",\"FirstCategoryId\":\"96\",\"ListUrlMain\":\"https://doi.org/10.21314/jrmv.2022.019\",\"RegionNum\":4,\"RegionCategory\":\"经济学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q4\",\"JCRName\":\"BUSINESS, FINANCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Risk Model Validation","FirstCategoryId":"96","ListUrlMain":"https://doi.org/10.21314/jrmv.2022.019","RegionNum":4,"RegionCategory":"经济学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"BUSINESS, FINANCE","Score":null,"Total":0}
引用次数: 0
引用
批量引用
General bounds on the area under the receiver operating characteristic curve and other performance measures when only a single sensitivity and specificity point is known
Receiver operating characteristic (ROC) curves are often used to quantify the performance of predictive models used in diagnosis, risk stratification and rating systems. The ROC area under the curve (AUC) summarizes the ROC in a single statistic, which also provides a probabilistic interpretation that is isomorphic to the Mann– Whitney–Wilcoxon test. In many settings, such as those involving diagnostic tests for diseases or antibodies, information about the ROC is not reported;instead the true positive. TP / and true negative. TN / rates are reported for a single threshold value. We demonstrate how to calculate the upper and lower bounds for the ROC AUC, given a single. TP;TN / pair. We use simple geometric arguments only, and we present two examples of real-world applications from medicine and finance, involving Covid-19 diagnosis and credit card fraud detection, respectively. In addition, we introduce formally the notion of “pathological” ROC curves and “well-behaved” ROC curves. In the case of well-behaved ROC curves, the bounds on the AUC may be made tighter. In certain special cases involving pathological ROC curves that result from what we term “George Costanza” classifiers, we may transform predictions to obtain well-behaved ROC curves with higher AUC than the original decision process. Our results also enable the calculation of other quantities of interest, such as Cohen’s d or the Pearson correlation between a diagnostic outcome and an actual outcome. These results facilitate the direct comparison of reported performance when model or diagnostic performance is reported for only a single score threshold. © 2022. Infopro Digital Risk (IP) Limited