FRELSA: A dataset for frailty in elderly people originated from ELSA and evaluated through machine learning models

Impact factor 3.7, Zone 2 (Medicine), JCR Q2 (Computer Science, Information Systems)
Matteo Leghissa, Álvaro Carrera, Carlos Á. Iglesias
DOI: 10.1016/j.ijmedinf.2024.105603
Journal: International Journal of Medical Informatics
Published: 2024-08-19 (Journal Article)
URL: https://www.sciencedirect.com/science/article/pii/S1386505624002661

Abstract

Background

Frailty is an age-related syndrome characterized by loss of strength and exhaustion and associated with multi-morbidity. Early detection and prediction of the appearance of frailty could help older people age better and prevent them from needing invasive and expensive treatments. Machine learning techniques show promising results in creating a medical support tool for such a task.

Methods

This study aims to create a dataset for machine learning-based frailty studies, using Fried's Frailty Phenotype definition. Starting from a longitudinal study on aging in the UK population, we defined a frailty label for each subject. We evaluated the definition by training seven different models for detecting frailty with data that were contemporaneous with those used for the definition. We then integrated more data from two years before to obtain prediction models with a 24-month horizon. Feature selection was performed using the MultiSURF algorithm, which ranks all features in order of relevance to the detection or prediction task.
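As a rough illustration of the labeling step, Fried's Frailty Phenotype counts five criteria (unintentional weight loss, exhaustion, weakness, slowness, low physical activity) and calls a subject frail when three or more are met, pre-frail when one or two are met, and non-frail otherwise. The sketch below assumes the five criteria have already been computed as booleans; the field names are hypothetical, and the abstract does not say how each criterion is derived from the ELSA variables.

```python
from typing import Mapping

# Hypothetical field names for the five Fried criteria (not taken from the paper).
FRIED_CRITERIA = (
    "weight_loss",
    "exhaustion",
    "weakness",
    "slowness",
    "low_activity",
)


def fried_label(criteria: Mapping[str, bool]) -> str:
    """Map the number of Fried criteria met to a three-class frailty label."""
    score = sum(bool(criteria[name]) for name in FRIED_CRITERIA)
    if score >= 3:
        return "frail"
    if score >= 1:
        return "pre-frail"
    return "non-frail"


# Toy subject meeting two criteria (illustrative values only).
subject = {
    "weight_loss": False,
    "exhaustion": True,
    "weakness": True,
    "slowness": False,
    "low_activity": False,
}
print(fried_label(subject))
```

Keeping the criteria as separate booleans, rather than storing only the final label, makes it possible to inspect which criteria drive a subject's classification.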

Results

We present a new frailty dataset of 5303 subjects and more than 6500 available features. It is publicly available, provided one has access to the original English Longitudinal Study of Ageing dataset. The dataset is balanced after grouping frailty with pre-frailty, and it is suitable for multiclass or binary classification and prediction problems. The seven tested architectures performed similarly, forming a solid baseline that can be improved with future work. Linear regression achieved the best F-score and AUROC in detection and prediction tasks.
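To make the binary setting and the two reported metrics concrete, here is a minimal sketch in plain Python. It groups pre-frail with frail into a single positive class, then computes the F-score (F1) from hard predictions and the AUROC from predicted scores using the pairwise-ranking definition. The function names and the toy data are illustrative assumptions, not the authors' evaluation code or results.

```python
from typing import Sequence


def to_binary(label: str) -> int:
    """Group frail and pre-frail into the positive class (1); non-frail is 0."""
    return 0 if label == "non-frail" else 1


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 = 2*TP / (2*TP + FP + FN) for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (made-up labels and scores, not values from the dataset).
labels = ["non-frail", "non-frail", "pre-frail", "frail"]
y_true = [to_binary(label) for label in labels]
scores = [0.1, 0.4, 0.35, 0.8]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(f1_score(y_true, y_pred), auroc(y_true, scores))
```

The pairwise AUROC is quadratic in the number of subjects, which is fine for illustration; a rank-based implementation would be preferable at the scale of thousands of subjects.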

Conclusions

Creating new frailty-annotated datasets of this size is necessary to develop and improve frailty prediction techniques. We have shown that our dataset can be used to study and test machine learning models to detect and predict frailty. Future work should improve the models' architecture and performance, consider explainability, and possibly enrich the dataset with older waves.

Source journal
International Journal of Medical Informatics (Medicine; Computer Science: Information Systems)
CiteScore: 8.90
Self-citation rate: 4.10%
Articles published: 217
Review time: 42 days
About the journal: International Journal of Medical Informatics provides an international medium for dissemination of original results and interpretative reviews concerning the field of medical informatics. The Journal emphasizes the evaluation of systems in healthcare settings. The scope of the journal covers: information systems, including national or international registration systems, hospital information systems, departmental and/or physician's office systems, document handling systems, electronic medical record systems, standardization, systems integration, etc.; computer-aided medical decision support systems using heuristic, algorithmic and/or statistical methods as exemplified in decision theory, protocol development, artificial intelligence, etc.; educational computer-based programs pertaining to medical informatics or medicine in general; organizational, economic, social, clinical impact, ethical and cost-benefit aspects of IT applications in health care.