Comparison of Resampling Methods for Bias-Reduced Estimation of Prediction Error: A Simulation Study Based on Real Datasets from Biomarker Discovery Studies
Authors: K. Kakumoto, Y. Tochizawa
Journal: Japanese journal of biometrics
Published: 2017-07-31
DOI: 10.5691/JJB.38.17 (https://doi.org/10.5691/JJB.38.17)
Citation count: 1
Abstract
Stepwise logistic regression is the traditional and most commonly used method for identifying biomarkers and evaluating the magnitude of their effects based on clinical data. Here, we evaluated the performance of four resampling methods (leave-one-out cross-validation, 10-fold cross-validation, bootstrap, and .632+ bootstrap) for internal validation of prediction analyses that use stepwise logistic regression. We conducted simulation studies to compare how well these methods estimate prediction accuracy, using simulation settings (including statistical models) derived from two real biomarker discovery studies (Ogata et al., Leukemia Research 2012; 36: 1229–1236; Yoshimi et al., Molecular Psychiatry 2016; 21: 1504–1510). The simulation results revealed that leave-one-out cross-validation, 10-fold cross-validation, and .632+ bootstrap were comparable in terms of root mean square error. We therefore recommend applying these methods to similar biomarker discovery studies that involve approximately ten biomarkers, with or without binary biomarkers (such as sex), and various degrees of correlation between the biomarkers. [...] samples were defined to evaluate the methods accurately, and the cut-off values were set at three values for application of ROC. Our results indicate that the performances of leave-one-out cross-validation, 10-fold cross-validation, and .632+ bootstrap are comparable under previously encountered conditions.
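To make the .632+ bootstrap and the root mean square error criterion concrete, the following minimal Python sketch shows one common formulation of the .632+ estimator for a binary outcome (following Efron and Tibshirani's definition), together with an RMSE helper for comparing error estimates against a known true error. It is not the authors' simulation code. The function names and the inputs (an apparent error, a leave-one-out bootstrap error, and a no-information rate computed elsewhere) are assumptions made for illustration.

```python
import math


def no_information_rate(y_true, y_pred):
    """Error rate expected if binary (0/1) labels and predictions were independent."""
    n = len(y_true)
    p1 = sum(y_true) / n  # observed proportion of class 1
    q1 = sum(y_pred) / n  # predicted proportion of class 1
    return p1 * (1 - q1) + (1 - p1) * q1


def err_632_plus(apparent_err, loo_boot_err, gamma):
    """Combine apparent and leave-one-out bootstrap errors with the .632+ weight.

    apparent_err: error of the model on the data it was fitted to
    loo_boot_err: average error on observations left out of each bootstrap sample
    gamma:        no-information error rate
    """
    # Cap the out-of-sample error at the no-information rate.
    err1 = min(loo_boot_err, gamma)
    # Relative overfitting rate, kept inside [0, 1].
    if err1 > apparent_err and gamma > apparent_err:
        r = (err1 - apparent_err) / (gamma - apparent_err)
    else:
        r = 0.0
    # The weight moves from 0.632 (no overfitting) to 1 (maximal overfitting).
    w = 0.632 / (1.0 - 0.368 * r)
    return (1.0 - w) * apparent_err + w * err1


def rmse(estimates, true_error):
    """Root mean square error of repeated error estimates around a known true error."""
    squared = [(e - true_error) ** 2 for e in estimates]
    return math.sqrt(sum(squared) / len(squared))
```

When the relative overfitting rate is zero, the result reduces to the ordinary .632 estimate (0.368 times the apparent error plus 0.632 times the leave-one-out bootstrap error). When overfitting is maximal, the estimate equals the capped leave-one-out bootstrap error.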