Phenotyping to predict 12-month health outcomes of older general medicine patients

IF 3.4 | Zone 3 (Medicine) | Q2 GERIATRICS & GERONTOLOGY

Aging Clinical and Experimental Research. Pub Date: 2025-02-22. DOI: 10.1007/s40520-024-02924-2

Richard John Woodman, Kimberly Bryant, Michael J. Sorich, Campbell H. Thompson, Patrick Russell, Alberto Pilotto, Aleksander A. Mangoni

Volume 37, issue 1. Publication type: Journal Article. Full text: https://link.springer.com/article/10.1007/s40520-024-02924-2 (PDF: https://link.springer.com/content/pdf/10.1007/s40520-024-02924-2.pdf)

Citations: 0

Abstract

Background

A variety of unsupervised learning algorithms have been used to phenotype older patients, enabling directed care and personalised treatment plans. However, how accurately the resulting clusters discriminate risk among older patients may vary depending on the methods employed.

Aims

To compare seven clustering algorithms in their ability to develop patient phenotypes that accurately predict health outcomes.

Methods

Data was collected for N = 737 older medical inpatients during their hospital stay, covering five types of medical data (ICD-10 codes, ATC drug codes, laboratory, clinic and frailty data). We trialled five unsupervised learning algorithms (K-means, K-modes, hierarchical clustering, latent class analysis (LCA), and DBSCAN) and two graph-based approaches, creating separate clusters for each method and data type. The clusters were used as input for a random forest classifier to predict eleven health outcomes: mortality at one, three, six and 12 months, in-hospital falls and delirium, length-of-stay, outpatient visits, and readmissions at one, three and six months.
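The feature-construction step described above (cluster each data type separately, then pass the clusters to a classifier) might look roughly like the sketch below. This is not the authors' code. It assumes synthetic data, two hypothetical data types, a plain K-means with k = 2, and one-hot encoded cluster labels as the classifier input; the abstract does not specify the encoding. The sketch stops at the feature matrix, which a library classifier such as a random forest would then consume.

```python
import random


def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's K-means; returns one integer cluster label per point."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def one_hot(label, k):
    """Encode a cluster label as k binary indicators."""
    return [1 if label == c else 0 for c in range(k)]


def synthetic_block(n_patients, n_features, rng):
    """Hypothetical stand-in for one data type: two shifted Gaussian groups."""
    return [
        [rng.gauss(5.0 * (i % 2), 0.5) for _ in range(n_features)]
        for i in range(n_patients)
    ]


rng = random.Random(42)
n_patients, k = 40, 2  # illustrative sizes, not the study's N = 737

# Assumption: two made-up data types; the study used five.
data_types = {
    "laboratory": synthetic_block(n_patients, 3, rng),
    "frailty": synthetic_block(n_patients, 2, rng),
}

# Cluster each data type separately, then concatenate the one-hot labels
# so that every patient gets one block of indicators per data type.
features = [[] for _ in range(n_patients)]
for name, block in data_types.items():
    labels = kmeans(block, k)
    for row, lab in zip(features, labels):
        row.extend(one_hot(lab, k))

print(len(features), len(features[0]))  # 40 patients x 4 cluster indicators
```

Keeping one block of indicators per data type makes it possible to ask later which data type contributed most to a given prediction, which is the kind of comparison the abstract reports.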

Results

The overall median area-under-the-curve (AUC) values across the eleven outcomes for the seven methods were (from highest to lowest) 0.758 (hierarchical), 0.739 (K-means), 0.722 (KG-Louvain), 0.704 (KNN-Louvain), 0.698 (LCA), 0.694 (DBSCAN) and 0.656 (K-modes). Overall, frailty data was the most important data type for predicting mortality, ICD-10 disease codes were the most important for predicting readmissions, and laboratory data was the most important for predicting falls.
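For readers unfamiliar with the summary statistic, the sketch below shows one way a per-method median AUC across outcomes can be computed. It uses the rank-based definition of AUC: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The outcome names, labels and scores are invented toy values, not study data, and the abstract does not say how the authors computed AUC.

```python
from statistics import median


def auc(labels, scores):
    """Rank-based AUC: P(score of a positive > score of a negative), ties = 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy predictions for three outcomes from a single method:
# (true binary labels, predicted risk scores).
toy_predictions = {
    "outcome_a": ([1, 0, 1, 0], [0.9, 0.2, 0.6, 0.4]),
    "outcome_b": ([1, 0, 1, 0], [0.3, 0.2, 0.8, 0.5]),
    "outcome_c": ([1, 1, 0, 0], [0.5, 0.5, 0.5, 0.5]),
}

# One AUC per outcome, then the median across outcomes for this method.
per_outcome = {name: auc(y, s) for name, (y, s) in toy_predictions.items()}
median_auc = median(per_outcome.values())

print(per_outcome)  # {'outcome_a': 1.0, 'outcome_b': 0.75, 'outcome_c': 0.5}
print(median_auc)   # 0.75
```

Using the median rather than the mean keeps a single unusually easy or hard outcome from dominating the comparison between methods.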

Conclusions

Clusters created using hierarchical, K-means and Louvain community detection algorithms identified well-separated patient phenotypes that were consistently associated with age-related adverse health outcomes. Frailty data was the most valuable data type for predicting most health outcomes.

Source journal

Aging Clinical and Experimental Research (Medicine: Geriatrics)

CiteScore

7.90

Self-citation rate

5.00%

Articles published

283

Review time

1 month

About the journal: Aging Clinical and Experimental Research offers a multidisciplinary forum on the progressing field of gerontology and geriatrics. The areas covered by the journal include: biogerontology, neurosciences, epidemiology, clinical gerontology and geriatric assessment, and social, economic and behavioral gerontology. “Aging Clinical and Experimental Research” appears bimonthly and publishes review articles, original papers and case reports.