Explainable SHAP-XGBoost models for identifying important social factors associated with the atherosclerotic cardiovascular disease risk score using the LASSO feature selection technique.
IF 2.2 · Zone 4 (Medicine) · Q2 · Public, Environmental & Occupational Health
Jungtae Choi, Jooeun Jeon, Hyoeun An, Hyeon Chang Kim
Epidemiology and Health, e2025052. Published 2025-09-10. Journal Article. DOI: https://doi.org/10.4178/epih.e2025052
Citations: 0
Abstract
Objectives: Extensive evidence indicates that social factors play an essential role in explaining atherosclerotic cardiovascular disease (ASCVD). This study aimed to examine which social factors are associated with the estimated 10-year ASCVD risk score among male and female adults, incorporating both multifaceted social network components and conventional risk factors.
Methods: Using data from 4368 middle-aged Korean adults, we explored factors most likely to explain ASCVD risk with interpretable machine learning algorithms. The ASCVD risk was determined using the 10-year ASCVD risk score, as calculated using pooled cohort equations. Social network components were assessed through the name generator module. A total of 52 variables were included in the model.
Results: For male participants (area under the receiver operating characteristic curve [AUC] = 0.65), the average number of years participants had known their network members contributed most to ASCVD risk prediction (mean Shapley additive explanations [SHAP] value = 0.31), followed by spouse's education level (0.22), medical history with diagnosis (0.18), and snoring frequency (0.14). By contrast, for female participants (AUC = 0.60), medical history with diagnosis was the strongest predictor (0.47), followed by logged income (0.21), education level (0.19), and the average number of years participants had known their network members (0.17).
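As a rough illustration of how a mean SHAP ranking of this kind is formed, the sketch below computes exact Shapley values by enumerating feature subsets for a toy three-feature scoring function, then averages their absolute values across rows and sorts the features. Everything in it is assumed for illustration: `toy_model`, the names `feature_a` to `feature_c`, and the four synthetic rows are hypothetical and are not the study's data or its fitted XGBoost model. Replacing "absent" features with their column means and averaging absolute values are common simplifications, not necessarily the authors' procedure.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, background):
    """Exact Shapley values for one row `x`.

    Features outside a coalition are replaced by their mean over `background`
    (an assumed, simplified way of marking a feature as absent).
    """
    n = len(x)
    baseline = [sum(row[j] for row in background) / len(background) for j in range(n)]

    def value(subset):
        # Coalition members keep their real value; the rest take the baseline.
        z = [x[j] if j in subset else baseline[j] for j in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for combo in combinations(others, k):
                s = set(combo)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def mean_abs_shap(predict, rows):
    """Average |Shapley value| per feature across all rows."""
    n = len(rows[0])
    totals = [0.0] * n
    for row in rows:
        for j, phi in enumerate(shapley_values(predict, row, rows)):
            totals[j] += abs(phi)
    return [t / len(rows) for t in totals]


def toy_model(z):
    # Hypothetical scoring function with one interaction term;
    # NOT the study's model.
    return 0.5 * z[0] + 0.2 * z[1] + 0.1 * z[0] * z[2]


# Synthetic rows, invented purely for this example.
data = [
    [1.0, 0.0, 2.0],
    [3.0, 1.0, 0.0],
    [2.0, 4.0, 1.0],
    [0.0, 2.0, 3.0],
]
names = ["feature_a", "feature_b", "feature_c"]

importance = mean_abs_shap(toy_model, data)
ranking = sorted(zip(names, importance), key=lambda p: p[1], reverse=True)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Exact enumeration scales exponentially with the number of features, so it is only practical for toy cases like this one; with 52 variables, a tree-specific approximation such as the one provided by SHAP libraries for gradient-boosted trees would be the realistic choice.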
Conclusion: Several important social factors were associated with the ASCVD risk score in both male and female adults. However, longitudinal research is needed to determine whether these factors predict future ASCVD events.
About the journal:
Epidemiology and Health (epiH) is an electronic journal publishing papers in all areas of epidemiology and public health. It is indexed in PubMed Central, and its scope is wide-ranging, including descriptive, analytical, and molecular epidemiology; primary preventive measures; screening approaches and secondary prevention; clinical epidemiology; and all aspects of the prevention of communicable and non-communicable diseases. epiH publishes original research and also welcomes review articles and meta-analyses, cohort profiles and data profiles, epidemic and case investigations, descriptions and applications of new methods, and discussions of research theory or public health policy. We give special consideration to papers from developing countries.