A comparative analysis of binary and multi-class classification machine learning algorithms to detect current frailty status using the English longitudinal study of ageing (ELSA).

Impact factor: 3.3 · JCR Q2 · Geriatrics & Gerontology

Frontiers in Aging, volume 6, article 1501168. Pub Date: 2025-04-22. eCollection Date: 2025-01-01. DOI: 10.3389/fragi.2025.1501168

Charmayne Mary Lee Hughes, Yan Zhang, Ali Pourhossein, Terezia Jurasova



Abstract

Background: Physical frailty is a pressing public health issue that significantly increases the risk of disability, hospitalization, and mortality. Early and accurate detection of frailty is essential for timely intervention, reducing its widespread impact on healthcare systems, social support networks, and economic stability.

Objective: This study aimed to classify frailty status into binary (frail vs. non-frail) and multi-class (frail vs. pre-frail vs. non-frail) categories. The goal was to detect and classify frailty status at a specific point in time. Model development and internal validation were conducted using data from wave 8 of the English Longitudinal Study of Ageing (ELSA), with external validation using wave 6 data to assess model generalizability.

Methods: Nine classification algorithms (Logistic Regression, Random Forest, K-Nearest Neighbors, Gradient Boosting, AdaBoost, XGBoost, LightGBM, CatBoost, and Multi-Layer Perceptron) were evaluated and their performance compared.

Results: CatBoost demonstrated the best overall performance in binary classification. On the internal validation set it achieved high recall (0.951), high balanced accuracy (0.928), and the lowest Brier score (0.049). It maintained strong performance on the external validation set, with a recall of 0.950, balanced accuracy of 0.913, and F1-score of 0.951. Multi-class classification was more challenging. Gradient Boosting emerged as the top model, achieving the highest recall (0.666) and precision (0.663) on the external validation set, with an F1-score of 0.664 and reasonable calibration (Brier score = 0.223).
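For readers less familiar with the metrics reported above, the following is a minimal sketch of how recall, precision, F1-score, balanced accuracy, and the binary Brier score are conventionally defined. It is not the authors' evaluation code. The function names, the toy labels and probabilities, and the 0.5 decision threshold are all illustrative assumptions, and the abstract does not state which averaging scheme or Brier formulation was used for the multi-class results.

```python
from typing import Hashable, Sequence


def confusion_counts(y_true: Sequence[Hashable], y_pred: Sequence[Hashable],
                     positive: Hashable = 1) -> tuple:
    """Return (tp, fp, fn, tn), treating `positive` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def recall(y_true, y_pred, positive=1) -> float:
    """Share of truly positive cases that the model flags as positive."""
    tp, _, fn, _ = confusion_counts(y_true, y_pred, positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


def precision(y_true, y_pred, positive=1) -> float:
    """Share of predicted-positive cases that are truly positive."""
    tp, fp, _, _ = confusion_counts(y_true, y_pred, positive)
    return tp / (tp + fp) if (tp + fp) else 0.0


def f1_score(y_true, y_pred, positive=1) -> float:
    """Harmonic mean of precision and recall."""
    p = precision(y_true, y_pred, positive)
    r = recall(y_true, y_pred, positive)
    return 2 * p * r / (p + r) if (p + r) else 0.0


def balanced_accuracy(y_true, y_pred) -> float:
    """Unweighted mean of per-class recall; works for two or more classes."""
    classes = sorted(set(y_true))
    return sum(recall(y_true, y_pred, c) for c in classes) / len(classes)


def brier_score(y_true, prob_positive: Sequence[float], positive=1) -> float:
    """Mean squared gap between predicted probability and the 0/1 outcome."""
    outcomes = [1.0 if t == positive else 0.0 for t in y_true]
    return sum((p - o) ** 2 for p, o in zip(prob_positive, outcomes)) / len(outcomes)


# Toy data (hypothetical): 1 = frail, 0 = non-frail.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
prob_frail = [0.9, 0.8, 0.4, 0.1, 0.3, 0.6, 0.2, 0.7]
y_pred = [1 if p >= 0.5 else 0 for p in prob_frail]  # assumed 0.5 threshold

print("recall:", recall(y_true, y_pred))
print("balanced accuracy:", balanced_accuracy(y_true, y_pred))
print("F1:", f1_score(y_true, y_pred))
print("Brier:", brier_score(y_true, prob_frail))
```

Recall and balanced accuracy depend only on the thresholded predictions, whereas the Brier score is computed from the predicted probabilities themselves. A model can therefore rank well on one and poorly on the other, which is presumably why the abstract reports both.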

Conclusion: Machine learning algorithms show promise for the detection of current frailty status, particularly in binary classification. However, distinguishing between frailty subcategories remains challenging, highlighting the need for improved models and feature selection strategies to enhance multi-class classification accuracy.
