Machine learning used to study risk factors for chronic diseases: A scoping review
Mahek Shergill, Steve Durant, Sharon Birdi, Roxana Rabet, Carolyn Ziegler, Shehzad Ali, David Buckeridge, Marzyeh Ghassemi, Jennifer Gibson, Ava John-Baptiste, Jillian Macklin, Melissa McCradden, Kwame McKenzie, Parisa Naraei, Akwasi Owusu-Bempah, Laura C Rosella, James Shaw, Ross Upshur, Sharmistha Mishra, Andrew D Pinto
Canadian Journal of Public Health / Revue canadienne de santé publique. Published 2025-06-11. DOI: https://doi.org/10.17269/s41997-025-01059-9
Abstract
Objectives: Machine learning (ML) has received significant attention for its potential to process and learn from vast amounts of data. Our aim was to perform a scoping review to identify studies that used ML to study risk factors for chronic diseases at a population level, notably those that incorporated methods to mitigate algorithmic bias. We focused on ML applications for the most common risk factors for chronic disease: tobacco use, alcohol use, unhealthy eating, physical activity, and psychological stress.
Methods: We searched the peer-reviewed, indexed literature using Medline (Ovid), Embase (Ovid), Cochrane Central Register of Controlled Trials and Cochrane Database of Systematic Reviews (Ovid), Scopus, ACM Digital Library, INSPEC, and Web of Science's Science Citation Index, Social Sciences Citation Index, and Emerging Sources Citation Index. Among the included studies, we examined whether bias was considered and identified strategies employed to mitigate bias.
Synthesis: The search identified 10,329 studies, of which 20 met our inclusion criteria. The included studies used ML for a wide range of goals, from predicting chronic disease development, to automating the classification of data, to identifying new associations between risk factors and disease. Nine studies (45%) included some discussion of algorithmic bias. Studies that incorporated a broad array of sociodemographic variables did so primarily to improve the performance of an ML model rather than to mitigate potential harms to populations made vulnerable by social and economic policies.
Conclusion: This work contributes to our understanding of how ML can be used to advance population and public health.
About the journal:
The Canadian Journal of Public Health is dedicated to fostering excellence in public health research, scholarship, policy and practice. The aim of the Journal is to advance public health research and practice in Canada and around the world, thus contributing to the improvement of the health of populations and the reduction of health inequalities.
CJPH publishes original research and scholarly articles submitted in either English or French that are relevant to population and public health.
CJPH is an independent, peer-reviewed journal owned by the Canadian Public Health Association and published by Springer.