David Bernard, Emmanuel Doumard, Isabelle Ader, Philippe Kemoun, Jean-Christophe Pagès, Anne Galinier, Sylvain Cussat-Blanc, Felix Furger, Luigi Ferrucci, Julien Aligon, Cyrille Delpierre, Luc Pénicaud, Paul Monsarrat, Louis Casteilla
Aging Cell, volume 22, issue 8. Published 2023-06-10. DOI: 10.1111/acel.13872. Impact factor: 7.1 (JCR Q1, Cell Biology). https://onlinelibrary.wiley.com/doi/10.1111/acel.13872
Explainable machine learning framework to predict personalized physiological aging
Attaining personalized healthy aging requires accurate monitoring of physiological changes and the identification of subclinical markers that predict accelerated or delayed aging. Classic biostatistical methods mostly rely on supervised variables to estimate physiological aging and do not capture the full complexity of inter-parameter interactions. Machine learning (ML) is promising, but its black-box nature eludes direct understanding, substantially limiting physician confidence and clinical usage. Using a broad population dataset from the National Health and Nutrition Examination Survey (NHANES) that includes routine biological variables, and after selecting XGBoost as the most appropriate algorithm, we created an innovative explainable ML framework to determine a Personalized Physiological Age (PPA). PPA predicted both chronic disease and mortality independently of chronological age. Twenty-six variables were sufficient to predict PPA. Using SHapley Additive exPlanations (SHAP), we associated a precise quantitative metric with each variable, explaining physiological deviations (i.e., accelerated or delayed aging) from age-specific normative data. Among the variables, glycated hemoglobin (HbA1c) carries a major relative weight in the estimation of PPA. Finally, clustering profiles of identical contextualized explanations reveals different aging trajectories, opening opportunities for specific clinical follow-up. These data show that PPA is a robust, quantitative and explainable ML-based metric that monitors personalized health status. Our approach also provides a complete framework applicable to different datasets or variables, allowing precision physiological age estimation.
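To make the idea of a per-variable explanation concrete, here is a minimal sketch of Shapley-value attribution on a toy age predictor. It is not the authors' pipeline: the paper uses XGBoost with SHAP on NHANES data, whereas this sketch enumerates all feature coalitions by brute force on a hand-written model. The function `toy_age_model`, the three feature names, the coefficients, and the `BASELINE` reference values are all hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import factorial

# Hypothetical reference values standing in for "normative" data.
BASELINE = {"hba1c": 5.5, "systolic_bp": 120.0, "albumin": 4.3}


def toy_age_model(x):
    """Hypothetical stand-in for a trained regressor; returns an age in years."""
    age = 50.0
    age += 6.0 * (x["hba1c"] - 5.5)
    age += 0.2 * (x["systolic_bp"] - 120.0)
    age -= 5.0 * (x["albumin"] - 4.3)
    # Interaction term, so the attribution is not trivially linear.
    age += 0.1 * (x["hba1c"] - 5.5) * (x["systolic_bp"] - 120.0)
    return age


def shapley_values(model, x, baseline):
    """Exact Shapley values obtained by enumerating every feature coalition."""
    features = list(x)
    n = len(features)

    def value(coalition):
        # Features in the coalition keep the individual's values;
        # all other features are set to the baseline.
        mixed = {f: (x[f] if f in coalition else baseline[f]) for f in features}
        return model(mixed)

    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


person = {"hba1c": 7.5, "systolic_bp": 140.0, "albumin": 4.1}
phi = shapley_values(toy_age_model, person, BASELINE)
predicted = toy_age_model(person)
reference = toy_age_model(BASELINE)

for name, contribution in phi.items():
    print(f"{name}: {contribution:+.1f} years")
print(f"predicted {predicted:.1f} vs reference {reference:.1f}")
```

The contributions sum to the gap between the individual's prediction and the reference prediction, which is the additive property that lets each variable be read as pushing the estimated age up or down. Brute-force enumeration scales exponentially with the number of features, so for a model with 26 variables a tree-specific algorithm such as the one in the SHAP library would be the practical choice.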
Aging Cell (Biochemistry, Genetics and Molecular Biology: Cell Biology)
Self-citation rate
2.60%
Articles published
212
About the journal:
Aging Cell is an Open Access journal that focuses on the core aspects of the biology of aging, encompassing the entire spectrum of geroscience. The journal's content is dedicated to publishing research that uncovers the mechanisms behind the aging process and explores the connections between aging and various age-related diseases. This journal aims to provide a comprehensive understanding of the biological underpinnings of aging and its implications for human health.
The journal is widely recognized and its content is abstracted and indexed by numerous databases and services, which facilitates its accessibility and impact in the scientific community. These include:
Academic Search (EBSCO Publishing)
Academic Search Alumni Edition (EBSCO Publishing)
Academic Search Premier (EBSCO Publishing)
Biological Science Database (ProQuest)
CAS: Chemical Abstracts Service (ACS)
Embase (Elsevier)
InfoTrac (GALE Cengage)
Ingenta Select
ISI Alerting Services
Journal Citation Reports/Science Edition (Clarivate Analytics)
MEDLINE/PubMed (NLM)
Natural Science Collection (ProQuest)
PubMed Dietary Supplement Subset (NLM)
Science Citation Index Expanded (Clarivate Analytics)
SciTech Premium Collection (ProQuest)
Web of Science (Clarivate Analytics)
Being indexed in these databases ensures that the research published in Aging Cell is discoverable by researchers, clinicians, and other professionals interested in the field of aging and its associated health issues. This broad coverage helps to disseminate the journal's findings and contributes to the advancement of knowledge in geroscience.