On the optimism correction of the area under the receiver operating characteristic curve in logistic prediction models
Amaia Iparragirre, Irantzu Barrio, M. Rodríguez-Álvarez
Sort-Statistics and Operations Research Transactions, pp. 145-162. Published 2019-06-11. DOI: 10.2436/20.8080.02.82 (https://doi.org/10.2436/20.8080.02.82)
Citations: 2
Abstract
When the same data are used both to fit a model and to estimate its predictive performance, the estimate may be optimistic and needs to be corrected. This work compares the behaviour of different methods proposed in the literature for correcting the optimism of the estimated area under the receiver operating characteristic curve (AUC) in logistic regression models. A simulation study, in which the theoretical model is known, is conducted across different numbers of covariates, sample sizes, prevalences and degrees of correlation among covariates. The results support the use of k-fold cross-validation with replication and of the bootstrap.
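To make the idea of optimism correction concrete, here is a minimal sketch in plain Python. It is not the authors' code. It assumes the commonly used bootstrap optimism procedure (refit on each resample, then subtract the average of "AUC on the resample minus AUC on the original data" from the apparent AUC) and a repeated k-fold cross-validation that averages per-fold AUCs. The abstract does not spell out these details, so they may differ from the variants compared in the paper. All function names, the simulated coefficients and the tuning values (`n_boot`, `k`, `repeats`, `steps`, `lr`) are hypothetical, and the logistic fit is a crude gradient ascent standing in for a proper maximum-likelihood routine.

```python
import math
import random

def auc(scores, labels):
    """Empirical AUC: share of (positive, negative) pairs ranked correctly, ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def linear_score(beta, x):
    """Linear predictor; the AUC depends only on how it ranks observations."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], x))

def fit_logistic(X, y, steps=200, lr=1.0):
    """Illustrative logistic fit by batch gradient ascent; returns [intercept, b1, ...]."""
    beta = [0.0] * (len(X[0]) + 1)
    for _ in range(steps):
        grad = [0.0] * len(beta)
        for x, yi in zip(X, y):
            z = max(-30.0, min(30.0, linear_score(beta, x)))  # clamp to avoid overflow
            resid = yi - 1.0 / (1.0 + math.exp(-z))
            grad[0] += resid
            for j, v in enumerate(x):
                grad[j + 1] += resid * v
        beta = [b + lr * g / len(y) for b, g in zip(beta, grad)]
    return beta

def model_auc(beta, X, y):
    return auc([linear_score(beta, x) for x in X], y)

def bootstrap_corrected_auc(X, y, rng, n_boot=20):
    """Return (apparent AUC, apparent AUC minus mean bootstrap optimism)."""
    n = len(y)
    apparent = model_auc(fit_logistic(X, y), X, y)
    optimism = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        if len(set(yb)) < 2:  # a resample needs both classes
            continue
        beta_b = fit_logistic(Xb, yb)
        optimism.append(model_auc(beta_b, Xb, yb) - model_auc(beta_b, X, y))
    return apparent, apparent - sum(optimism) / len(optimism)

def repeated_kfold_auc(X, y, rng, k=5, repeats=2):
    """Average held-out AUC over k folds, repeated with fresh random partitions."""
    n = len(y)
    fold_aucs = []
    for _ in range(repeats):
        idx = list(range(n))
        rng.shuffle(idx)
        for f in range(k):
            test = idx[f::k]
            held_out = set(test)
            train = [i for i in idx if i not in held_out]
            y_train, y_test = [y[i] for i in train], [y[i] for i in test]
            if len(set(y_train)) < 2 or len(set(y_test)) < 2:
                continue
            beta = fit_logistic([X[i] for i in train], y_train)
            fold_aucs.append(model_auc(beta, [X[i] for i in test], y_test))
    return sum(fold_aucs) / len(fold_aucs)

# Simulated data from a known logistic model (coefficients chosen for illustration).
rng = random.Random(0)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(120)]
y = [1 if rng.random() < 1 / (1 + math.exp(-(-0.5 + x[0] + 0.5 * x[1]))) else 0 for x in X]

apparent, boot_corrected = bootstrap_corrected_auc(X, y, rng)
cv_estimate = repeated_kfold_auc(X, y, rng)
print("apparent AUC:", round(apparent, 3))
print("bootstrap-corrected AUC:", round(boot_corrected, 3))
print("repeated k-fold CV AUC:", round(cv_estimate, 3))
```

In a real analysis the number of bootstrap resamples and cross-validation repetitions would normally be much larger than in this sketch. They are kept small here only so that the example runs quickly.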
About the journal:
SORT (Statistics and Operations Research Transactions), formerly Qüestiió, is an international journal launched in 2003. It is published twice yearly, in English, by the Statistical Institute of Catalonia (Idescat). The journal is co-edited by the Universitat Politècnica de Catalunya, Universitat de Barcelona, Universitat Autònoma de Barcelona, Universitat de Girona, Universitat Pompeu Fabra and Universitat de Lleida, with the co-operation of the Spanish Section of the International Biometric Society and the Catalan Statistical Society. SORT promotes the publication of original articles of a methodological or applied nature, or motivated by an applied problem, in statistics, operations research, official statistics or biometrics, as well as book reviews. Authors are encouraged to include an example with a real data set in their manuscripts.