Building and validating a predictive model for stroke risk in Chinese community-dwelling patients with chronic obstructive pulmonary disease using machine learning methods
Yong Chen, Yonglin Yu, Dongmei Yang, Xiaoju Chen
medRxiv - Respiratory Medicine. Published 2024-09-13. DOI: 10.1101/2024.09.12.24313533 (https://doi.org/10.1101/2024.09.12.24313533)
Abstract
Background: Stroke in patients with chronic obstructive pulmonary disease (COPD) can have potentially devastating consequences; however, there is still a lack of models that accurately predict the risk of stroke in community-based COPD patients in China. This study aimed to construct a novel model that accurately predicts the risk of stroke in community-based COPD patients in China by applying machine learning methods.
Methods: Clinical data from 809 community COPD patients in the 2020 China Health and Retirement Longitudinal Study (CHARLS) database were analyzed. The least absolute shrinkage and selection operator (LASSO) and multivariate logistic regression were used to identify predictors. Multiple machine learning (ML) classification models were evaluated to identify the optimal model, and Shapley Additive exPlanations (SHAP) were used to interpret it for personalized risk assessment.
Results: Six variables were identified as predictors of stroke in community-based COPD patients: Heart_disease, Hyperlipidemia, Hypertension, ADL_score, Cesd_score, and Parkinson. The logistic classification model was the optimal model, with a test-set area under the curve (AUC) of 0.913 (95% confidence interval [CI]: 0.835-0.992), accuracy of 0.823, sensitivity of 0.818, and specificity of 0.823.
Conclusions: The model constructed in this study shows relatively reliable predictive performance and may help clinicians identify, at an early stage, community COPD patients at high risk of stroke.
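For readers less familiar with the performance measures reported above (AUC with a 95% CI, accuracy, sensitivity, and specificity), the following is a minimal sketch of how such quantities can be computed from true labels and predicted risk scores. It is not the authors' code. The abstract does not state how the confidence interval or the classification threshold was obtained, so the percentile bootstrap and the 0.5 cutoff used here are assumptions, and the function names and toy data are hypothetical.

```python
import random

def classification_metrics(y_true, y_score, threshold=0.5):
    """Accuracy, sensitivity and specificity at a fixed cutoff.

    The 0.5 threshold is an assumption; the abstract does not report one.
    Requires at least one positive and one negative label.
    """
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }

def auc(y_true, y_score):
    """AUC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(y_true, y_score, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (an assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        yt = [y_true[i] for i in idx]
        if len(set(yt)) < 2:
            continue  # a resample needs both classes for the AUC to be defined
        stats.append(auc(yt, [y_score[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper

# Toy data for illustration only; these are NOT values from the study.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1, 0.05]

print(classification_metrics(toy_labels, toy_scores))
print(auc(toy_labels, toy_scores))
print(bootstrap_auc_ci(toy_labels, toy_scores, n_boot=500))
```

On a small test set the bootstrap interval can be wide, which is consistent with the fairly broad interval (0.835-0.992) reported for the test-set AUC.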