Siona Prasad, Sabina A. Murphy, David A. Morrow, Benjamin S. Scirica, Marc S. Sabatine, David D. Berg, Andrea Bellavia
Annals of Epidemiology, Volume 111, Pages 186-192. Published 2025-10-14. DOI: 10.1016/j.annepidem.2025.10.012
https://www.sciencedirect.com/science/article/pii/S1047279725003096
Application of machine learning and deep learning approaches for prediction modeling with time-to-event outcomes in clinical epidemiology. Methods comparison and practical considerations for generalizability and interpretability
Purpose
Clinical prediction models (CPMs) are essential tools for diagnosis and prognosis in clinical epidemiology. Machine learning (ML) and deep learning (DL) approaches provide flexible methods that can complement regression-based methods for CPMs when complex predictors such as clinical biomarkers are of interest. However, concerns have been raised about the ability of ML and DL to deliver desired properties of CPMs such as parsimony, generalizability, and interpretability.
Methods
In this study, we evaluated and applied selected regression-based, ML, and DL approaches for time-to-event outcomes in a clinical study integrating protein biomarkers and lipids into an existing CPM for cardiovascular risk.
Results
We observed considerable advantages from the application of gradient boosting machines (GBM: C-statistic=0.72; Brier Score=0.052), which provided the best balance between model flexibility, discrimination, calibration, and parsimony, the latter being directly related to one of the model parameters (shrinking rate). Further, GBM results can be used for individual risk prediction, providing an interpretable tool for CPM implementation.
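For readers unfamiliar with the discrimination metric reported above, the following is a minimal sketch of how a C-statistic can be computed for right-censored time-to-event data, using Harrell's pairwise definition. It is an illustration only and not the authors' implementation: the function name `harrell_c`, its argument names, and the toy data are hypothetical, and pairs with tied follow-up times are simply skipped, which is a simplifying assumption.

```python
from itertools import combinations


def harrell_c(times, events, risk_scores):
    """Harrell's concordance index for right-censored data (illustrative sketch).

    times       -- observed follow-up times
    events      -- 1 if the event was observed, 0 if the subject was censored
    risk_scores -- model output, where a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that subject `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the times differ and the subject with
        # the shorter time had an observed event (simplification: tied times
        # are skipped rather than given special handling).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # earlier event was assigned the higher risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half-concordant
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four subjects, the third one censored at time 6.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.7, 0.8, 0.1]
toy_c = harrell_c(toy_times, toy_events, toy_risks)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking of event times by predicted risk. The Brier score reported alongside it additionally requires handling of censoring (for example by inverse probability of censoring weighting), which this sketch does not cover.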
Conclusions
We compared ML and DL methods for CPM with time-to-event outcomes and discussed practical aspects of their implementation in clinical epidemiology including generalizability and interpretability. Adequately trained ML approaches can provide advantages in prediction modeling, especially when integrating complex predictors.
About the journal:
The journal emphasizes the application of epidemiologic methods to issues that affect the distribution and determinants of human illness in diverse contexts. Its primary focus is on chronic and acute conditions of diverse etiologies and of major importance to clinical medicine, public health, and health care delivery.