Lei Li, Matthew A Rysavy, Georgiy Bobashev, Abhik Das
BMC Medical Research Methodology 24(1):261. Published 2024-10-31. https://doi.org/10.1186/s12874-024-02389-x
Comparing methods for risk prediction of multicategory outcomes: dichotomized logistic regression vs. multinomial logit regression.
Background: Medical outcomes of interest to clinicians may have multiple categories. Researchers face several options for risk prediction of such outcomes, including dichotomized logistic regression and multinomial logit regression modeling. We aimed to compare these methods and provide guidance needed for practice.
Methods: We described dichotomized logistic regression, multinomial continuation-ratio logit regression, which is an alternative to standard multinomial logit regression for ordinal outcomes, and logistic competing risks regression. We then applied these methods to develop prediction models of survival and neurodevelopmental outcomes based on the NICHD Extremely Preterm Birth Outcome Tool model. The statistical and practical advantages and flaws of these methods were examined. Both discrimination and calibration of the estimated logistic models of dichotomized outcomes and continuation-ratio logit model were assessed.
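As a reading aid, the continuation-ratio logit model named in the Methods can be written in its usual textbook form. This is a sketch under assumptions, not necessarily the parameterization used in the paper: the symbols $Y$, $x$, $\alpha_j$, $\beta_j$ and $h_j$ are introduced here for illustration, the outcome is taken to have $K$ ordered categories, and the conditioning direction ($Y \ge j$) is one of two common conventions.

```latex
% Conditional (continuation-ratio) logit for category j = 1, ..., K-1
\operatorname{logit} h_j(x)
  = \log\frac{P(Y = j \mid Y \ge j,\, x)}{P(Y > j \mid Y \ge j,\, x)}
  = \alpha_j + x^{\top}\beta_j

% Implied unconditional category probabilities
P(Y = j \mid x) = h_j(x)\prod_{k<j}\bigl(1 - h_k(x)\bigr), \quad j = 1,\dots,K-1,
\qquad
P(Y = K \mid x) = \prod_{k<K}\bigl(1 - h_k(x)\bigr)
```

Because each category probability is a product of conditional probabilities along the ordered sequence, the $K$ probabilities sum to 1 by construction.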
Results: The dichotomized logistic models and multinomial continuation-ratio logit model had similar discrimination and calibration in predicting death and survival without neurodevelopmental impairment. However, the continuation-ratio logit model had better discrimination and calibration in predicting neurodevelopmental impairment. The sum of predicted probabilities of outcome categories from the dichotomized logistic models could deviate substantially from 100%, ranging from 87.7% to 124.0%, and the dichotomized logistic model of neurodevelopmental impairment greatly overpredicted low risks and underpredicted high risks.
Conclusions: Estimating multiple logistic regression models of dichotomized outcomes may result in poorly calibrated predictions for an outcome with multiple ordinal categories. Multinomial continuation-ratio logit regression produces better calibrated predictions, constrains the sum of predicted probabilities to 100%, and has the advantages of simplicity in model interpretation, flexibility to include outcome category-specific predictors and random-effect terms for patient heterogeneity by hospital. It also accounts for mutual dependence among multiple categories and accommodates competing risks.
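The sum-to-100% property mentioned in the Conclusions can be illustrated with a minimal Python sketch. It assumes three ordered categories (death, survival with impairment, survival without impairment) and uses made-up linear predictor values; the function names and numbers are hypothetical and are not taken from the paper or the NICHD tool.

```python
import math


def sigmoid(x: float) -> float:
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def continuation_ratio_probs(etas):
    """Turn K-1 conditional logits into K category probabilities.

    etas[j] is assumed to be the logit of P(Y = j | Y >= j).
    """
    probs = []
    remaining = 1.0  # probability mass not yet assigned to a category
    for eta in etas:
        h = sigmoid(eta)
        probs.append(remaining * h)
        remaining *= 1.0 - h
    probs.append(remaining)  # last category takes whatever is left
    return probs


def dichotomized_probs(etas):
    """One independent logistic model per category (category vs. all others)."""
    return [sigmoid(eta) for eta in etas]


# Hypothetical linear predictors for a single patient (illustrative only).
cr = continuation_ratio_probs([-1.0, 0.5])
dich = dichotomized_probs([-1.0, -0.8, -0.2])

print("continuation-ratio sum:", sum(cr))
print("dichotomized sum:", sum(dich))
```

In the continuation-ratio version the total is fixed at 1 whatever the coefficients are, whereas nothing ties the three separately fitted dichotomized predictions together, so their total can drift above or below 1.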
About the journal:
BMC Medical Research Methodology is an open access journal publishing original peer-reviewed research articles in methodological approaches to healthcare research. Articles on the methodology of epidemiological research, clinical trials and meta-analysis/systematic review are particularly encouraged, as are empirical studies of the associations between choice of methodology and study outcomes. BMC Medical Research Methodology does not aim to publish articles describing scientific methods or techniques: these should be directed to the BMC journal covering the relevant biomedical subject area.