High-Throughput Phenotyping of the Symptoms of Alzheimer Disease and Related Dementias Using Large Language Models: Cross-Sectional Study.

JMIR AI (impact factor 2.0). Published 2025-06-03. DOI: 10.2196/66926

You Cheng, Mrunal Malekar, Yingnan He, Apoorva Bommareddy, Colin Magdamo, Arjun Singh, Brandon Westover, Shibani S Mukerji, John Dickson, Sudeshna Das

Volume 4, article e66926. PDF (PubMed Central): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12174885/pdf/


Abstract

Background: Alzheimer disease and related dementias (ADRD) are complex disorders with overlapping symptoms and pathologies. Comprehensive records of symptoms in electronic health records (EHRs) are critical not only for reaching an accurate diagnosis but also for supporting ongoing research studies and clinical trials. However, these symptoms are often buried in unstructured clinical notes in EHRs, making manual extraction both time-consuming and labor-intensive.

Objective: We aimed to automate symptom extraction from the clinical notes of patients with ADRD using fine-tuned large language models (LLMs), compare its performance to regular expression-based symptom recognition, and validate the results using brain magnetic resonance imaging (MRI) data.
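For orientation, regular expression-based symptom recognition of the kind used as the comparison baseline can be sketched as follows. The abstract does not give the study's actual expressions, so the two patterns and the `flag_domains` helper below are hypothetical and cover only two of the seven domains.

```python
import re

# Hypothetical keyword patterns, for illustration only; the study's actual
# expressions are not given in the abstract.
DOMAIN_PATTERNS = {
    "memory": re.compile(r"\b(forgetful\w*|memory (loss|impairment))\b", re.IGNORECASE),
    "sleep": re.compile(r"\b(insomnia|sleep disturbance\w*)\b", re.IGNORECASE),
}


def flag_domains(note: str) -> dict[str, bool]:
    """Return, for each domain, whether any of its patterns matches the note."""
    return {domain: bool(p.search(note)) for domain, p in DOMAIN_PATTERNS.items()}


note = "Patient reports memory loss over 6 months; denies insomnia."
flags = flag_domains(note)
print(flags)
```

In this toy note the sleep domain is flagged even though insomnia is denied, because plain keyword matching does not handle negation. That is one common weakness of pattern-based extraction, and it is offered here as a general observation rather than a finding of the study.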

Methods: We fine-tuned LLMs to extract ADRD symptoms across the following 7 domains: memory, executive function, motor, language, visuospatial, neuropsychiatric, and sleep. We assessed the algorithm's performance by calculating the area under the receiver operating characteristic curve (AUROC) for each domain. The extracted symptoms were then validated in two analyses: (1) predicting ADRD diagnosis using the counts of extracted symptoms and (2) examining the association between ADRD symptoms and MRI-derived brain volumes.
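The per-domain AUROC evaluation can be illustrated with a minimal sketch. It assumes binary gold labels and model scores for each note; the `auroc` helper and the toy data are illustrative and are not taken from the study. The helper uses the pairwise definition of AUROC: the probability that a randomly chosen positive note receives a higher score than a randomly chosen negative one, with ties counted as one half.

```python
def auroc(labels, scores):
    """AUROC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy labels and scores for two of the seven domains (made-up values).
gold = {"memory": [1, 1, 0, 0, 1], "sleep": [0, 1, 0, 0, 1]}
pred = {"memory": [0.9, 0.8, 0.2, 0.4, 0.3], "sleep": [0.1, 0.7, 0.3, 0.2, 0.9]}

per_domain = {domain: auroc(gold[domain], pred[domain]) for domain in gold}
print(per_domain)
```

The pairwise form is quadratic in the number of notes, which is fine for a small example; a rank-based implementation would be preferable for large evaluation sets.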

Results: Symptom extraction across the 7 domains achieved high accuracy with AUROCs ranging from 0.97 to 0.99. Using the counts of extracted symptoms to predict ADRD diagnosis yielded an AUROC of 0.83 (95% CI 0.77-0.89). Symptom associations with brain volumes revealed that a smaller hippocampal volume was linked to memory impairments (odds ratio 0.62, 95% CI 0.46-0.84; P=.006), and reduced pallidum size was associated with motor impairments (odds ratio 0.73, 95% CI 0.58-0.90; P=.04).
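To make the reported odds ratios concrete, the sketch below shows how an odds ratio and its 95% CI relate to a logistic regression coefficient. It assumes a Wald-type interval (coefficient plus or minus 1.96 standard errors on the log-odds scale); the abstract does not state how the intervals were computed, and the function names are illustrative. The example backs out the implied coefficient and standard error from the reported hippocampal result and checks that they reproduce the reported interval.

```python
import math

Z_95 = 1.959963984540054  # 97.5th percentile of the standard normal


def odds_ratio_ci(beta, se, z=Z_95):
    """Odds ratio and Wald confidence interval from a log-odds coefficient."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def coef_from_ci(odds_ratio, lower, upper, z=Z_95):
    """Back out the coefficient and its standard error from a reported OR and CI."""
    beta = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se


# Reported association between hippocampal volume and memory impairment.
beta, se = coef_from_ci(0.62, 0.46, 0.84)
or_, lo, hi = odds_ratio_ci(beta, se)
print(round(or_, 2), round(lo, 2), round(hi, 2))
```

An odds ratio below 1 means the odds of impairment fall as the volume measure increases, which matches the statement that smaller volumes are linked to impairment. The unit of the volume measure is not given in the abstract.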

Conclusions: These results highlight the accuracy and reliability of our high-throughput ADRD phenotyping algorithm. By enabling automated symptom extraction, our approach has the potential to assist with differential diagnosis, as well as facilitate clinical trials and research studies of dementia.


