Development and validation of a machine learning model for predicting vulnerable carotid plaques using routine blood biomarkers and derived indicators: insights into sex-related risk patterns.
Authors: Yimin E, Zhichao Yao, Maolin Ge, Guijun Huo, Jian Huang, Yao Tang, Zhanao Liu, Ziyi Tan, Yuqi Zeng, Junjie Cao, Dayong Zhou
Journal: Cardiovascular Diabetology, volume 24, issue 1, article 326 (published 2025-08-10)
DOI: https://doi.org/10.1186/s12933-025-02867-6
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12337436/pdf/
Impact factor: 10.6; JCR Q1 (Cardiac & Cardiovascular Systems)
Citations: 0
Abstract
Background: Early detection of vulnerable carotid plaques is critical for stroke prevention. This study aimed to develop a machine learning model based on routine blood tests and derived indices to predict plaque vulnerability and assess sex-specific risk patterns across biomarker value ranges.
Methods: We retrospectively included 1701 hospitalized patients from Suzhou Municipal Hospital (2019-2020), selected from an initial cohort of 10,028 individuals. All patients underwent carotid ultrasound, with vulnerable plaques identified using predefined imaging criteria. A total of 30 laboratory variables (including blood count, coagulation, and biochemistry) were extracted, alongside derived indices such as the triglyceride-glucose index (TyG), atherogenic index of plasma (AIP), neutrophil-to-lymphocyte ratio (NLR), and others. Features were standardized and selected based on statistical and clinical relevance. Five machine learning models were trained using a 7:3 train-test split and evaluated by cross-validation. Model performance was assessed using AUC, sensitivity, and specificity. The best model was interpreted using SHapley Additive exPlanations (SHAP) analysis. Sex differences were explored using Mann-Whitney U tests and restricted cubic spline (RCS) modeling across value intervals.
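The abstract names the derived indices but does not give their formulas. As a minimal sketch, the Python below uses the definitions most commonly found in the literature; the formulas, units, function names, and sample values are assumptions for illustration and may differ from what the authors actually computed.

```python
import math


def tyg_index(triglycerides_mg_dl: float, fasting_glucose_mg_dl: float) -> float:
    """Triglyceride-glucose index, assuming the common definition
    ln(TG [mg/dL] * fasting glucose [mg/dL] / 2)."""
    return math.log(triglycerides_mg_dl * fasting_glucose_mg_dl / 2)


def aip(triglycerides_mmol_l: float, hdl_c_mmol_l: float) -> float:
    """Atherogenic index of plasma, assuming the common definition
    log10(TG / HDL-C) with both values in mmol/L."""
    return math.log10(triglycerides_mmol_l / hdl_c_mmol_l)


def nlr(neutrophils: float, lymphocytes: float) -> float:
    """Neutrophil-to-lymphocyte ratio (both counts in the same unit)."""
    return neutrophils / lymphocytes


# Hypothetical lab values for one patient, not taken from the study.
patient = {
    "tg_mg_dl": 150.0,
    "glucose_mg_dl": 100.0,
    "tg_mmol_l": 1.7,
    "hdl_mmol_l": 1.1,
    "neutrophils": 4.0,
    "lymphocytes": 2.0,
}

derived = {
    "TyG": tyg_index(patient["tg_mg_dl"], patient["glucose_mg_dl"]),
    "AIP": aip(patient["tg_mmol_l"], patient["hdl_mmol_l"]),
    "NLR": nlr(patient["neutrophils"], patient["lymphocytes"]),
}
print(derived)
```

Because TyG and AIP are defined on different unit systems, a real pipeline would need to confirm the laboratory's reporting units before computing them.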
Results: The Random Forest model showed the highest predictive performance (AUC = 0.847; 95% CI 0.791-0.895; specificity = 89.4%; sensitivity = 64.2%). SHAP analysis identified sex, age, fibrinogen, NLR, creatinine, fasting blood glucose, uric acid to high-density lipoprotein ratio (UHR), TyG, systemic inflammation response index (SIRI), and lymphocyte count as the top predictors. Significant sex-specific differences in SHAP values were observed for key biomarkers, including age, UHR, TyG, SIRI, and others. RCS modeling further revealed distinct sex-related patterns in plaque vulnerability across biomarker value ranges.
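For readers less familiar with the reported metrics, the sketch below shows how sensitivity, specificity, and AUC are computed from labels and predicted scores. The data are invented, the 0.5 threshold is an assumption, and the function names are hypothetical; none of it reproduces the study's results.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """AUC as the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: 1 = vulnerable plaque, 0 = no vulnerable plaque.
y_true = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]  # hypothetical predicted probabilities
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 decision threshold

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={auc(y_true, scores):.3f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why a model can pair a high AUC with a modest sensitivity at a given cutoff.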
Conclusion: A Random Forest model integrating routine blood markers and derived indices predicted vulnerable carotid plaques with good discrimination (AUC = 0.847). The results underscore the importance of sex-specific risk assessment, highlighting differential effects of key biomarkers between the sexes and across value intervals.
About the journal:
Cardiovascular Diabetology is a journal that welcomes manuscripts exploring various aspects of the relationship between diabetes, cardiovascular health, and the metabolic syndrome. We invite submissions related to clinical studies, genetic investigations, experimental research, pharmacological studies, epidemiological analyses, and molecular biology research in this field.