Virginia De Martin Topranin, Atle Wiig-Fisketjøn, Emma Botten, Håvard Dalen, Mette Langaas, Anja Bye
Sex-specific cardiovascular disease risk prediction using statistical learning and explainable artificial intelligence: the HUNT Study.
European Journal of Preventive Cardiology. Published 2025-03-10. DOI: 10.1093/eurjpc/zwaf135 (https://doi.org/10.1093/eurjpc/zwaf135)
Citations: 0
Abstract
Aims: Current risk prediction models, such as the Norwegian NORRISK 2, explain only a modest proportion of cardiovascular disease (CVD) incidence. This study aimed to develop improved sex-specific models for predicting 10-year CVD risk and to explore sex- and age-specific thresholds for intervention.
Methods: Data from 31,946 participants aged 40-79 years without prior CVD were analyzed. The data were randomly split into a training set (for estimation) and a test set (for model evaluation). An extreme gradient boosting (XGBoost) model was used to identify the most important predictive variables. Next, prediction models were developed on the training set for each sex separately using XGBoost and logistic regression. The models were evaluated on the test set using receiver-operating characteristic (ROC) and precision-recall (PR) curves. Finally, age- and sex-specific thresholds for intervention were explored.
Results: All traditional risk factors included in NORRISK 2 and the European SCORE2 model were important predictors for males, but not for females. Potential new risk predictors were identified. The XGBoost model improved CVD risk prediction for males: ROC-AUC increased by 0.013 and 0.012 units compared with NORRISK 2 and SCORE2, respectively, and PR-AUC increased by 12% and 11%, respectively. For females, neither the XGBoost nor the logistic regression model performed significantly better than NORRISK 2 and SCORE2. Age- and sex-specific thresholds showed an improvement in sensitivity compared with the thresholds suggested by NORRISK 2.
Conclusions: By employing statistical learning and incorporating sex-specific risk factors, we propose improved risk prediction models for CVD in males. Introducing sex-specific thresholds for intervention could enhance CVD prevention for both sexes.
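As a rough illustration of the evaluation metrics named in the abstract, the sketch below computes ROC-AUC, a step-wise PR-AUC (average precision), and sensitivity at a risk threshold in plain Python. The function names, the toy outcomes and predicted risks, and the two thresholds are all hypothetical and are not taken from the study; the authors' XGBoost and logistic regression pipeline is not reproduced here.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the probability that a random case outranks a random non-case."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    # Count concordant case/non-case pairs; tied pairs count as one half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise area under the PR curve; this simple version assumes no tied scores."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at the rank where this case is recalled
    return total / hits


def sensitivity(y_true: Sequence[int], scores: Sequence[float], threshold: float) -> float:
    """Share of true cases whose predicted risk reaches the intervention threshold."""
    case_scores = [s for y, s in zip(y_true, scores) if y == 1]
    return sum(s >= threshold for s in case_scores) / len(case_scores)


if __name__ == "__main__":
    # Hypothetical toy data: 1 = CVD event within 10 years, 0 = no event.
    outcomes = [0, 0, 1, 0, 1, 0, 1, 1]
    predicted_risk = [0.02, 0.04, 0.05, 0.06, 0.08, 0.10, 0.12, 0.20]

    print("ROC-AUC:", roc_auc(outcomes, predicted_risk))
    print("PR-AUC (average precision):", average_precision(outcomes, predicted_risk))
    # Two hypothetical thresholds, standing in for a general and a group-specific cut-off.
    for cutoff in (0.10, 0.05):
        print("sensitivity at", cutoff, "=", sensitivity(outcomes, predicted_risk, cutoff))
```

Lowering the threshold for a subgroup raises sensitivity at the cost of flagging more non-cases, which is the trade-off that age- and sex-specific thresholds are meant to tune.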
About the journal:
European Journal of Preventive Cardiology (EJPC) is an official journal of the European Society of Cardiology (ESC) and the European Association of Preventive Cardiology (EAPC). The journal covers a wide range of scientific, clinical, and public health disciplines related to cardiovascular disease prevention, risk factor management, cardiovascular rehabilitation, population science and public health, and exercise physiology. The categories covered by the journal include classical risk factors and treatment, lifestyle risk factors, non-modifiable cardiovascular risk factors, cardiovascular conditions, concomitant pathological conditions, sport cardiology, diagnostic tests, care settings, epidemiology, pharmacology and pharmacotherapy, machine learning, and artificial intelligence.