Predicting high blood pressure using machine learning models in low- and middle-income countries
Ekaba Bisong, Noor Jibril, Preethi Premnath, Elsy Buligwa, George Oboh, Adanna Chukwuma
BMC Medical Informatics and Decision Making, published 2024-08-23
DOI: https://doi.org/10.1186/s12911-024-02634-9
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11342471/pdf/
Citations: 0
Abstract
Responding to the rising global prevalence of noncommunicable diseases (NCDs) requires improvements in the management of high blood pressure. Therefore, this study aims to develop an explainable machine learning model for predicting high blood pressure, a key NCD risk factor, using data from the STEPwise approach to NCD risk factor surveillance (STEPS) surveys. Nationally representative samples of adults aged 18-69 years were acquired from 57 countries spanning six World Health Organization (WHO) regions. Data harmonization and processing were performed to standardize the selected predictors and synchronize features across countries, yielding 41 variables, including demographic, behavioural, physical, and biochemical factors. Five machine learning models (logistic regression, k-nearest neighbours, random forest, XGBoost, and a fully connected neural network) were trained and evaluated at global, regional, and country-specific levels using an 80/20 train-test split. Model performance was assessed using accuracy, precision, recall, and F1 score. Feature importance analysis identified age, weight, heart rate, waist circumference, and height as key predictors of blood pressure. Across the 57 countries studied, model performance varied considerably, with accuracy ranging from as low as 58.96% in some models for specific countries to as high as 81.41% in others, underscoring the need for region- and country-specific adaptations in modelling approaches. The explainable model offers an opportunity for population-level screening and continuous risk assessment in resource-limited settings.
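The evaluation protocol named in the abstract (an 80/20 train-test split scored by accuracy, precision, recall, and F1) can be illustrated with a minimal sketch. This is not the authors' code: the helper names `train_test_split_80_20` and `classification_metrics`, the fixed seed, and the toy labels are assumptions made for illustration only, and the labels are invented rather than taken from the STEPS data.

```python
import random


def train_test_split_80_20(rows, seed=0):
    """Shuffle a copy of `rows` and split it 80/20 (hypothetical helper)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(0.8 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for binary labels (1 = high BP)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels, invented for illustration (not study data).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)

# Splitting ten placeholder record IDs gives eight for training, two for testing.
train_ids, test_ids = train_test_split_80_20(range(10))
```

Reporting precision and recall alongside accuracy matters when the positive class is a minority, since a model can reach high accuracy while missing most positive cases.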
About the journal:
BMC Medical Informatics and Decision Making is an open access journal publishing original peer-reviewed research articles in relation to the design, development, implementation, use, and evaluation of health information technologies and decision-making for human health.