96. Using Machine Learning Models to Detect Early Alzheimer's Disease Through Speech Analysis

IF 3.8, Zone 2 (Medicine), Q1 GERIATRICS & GERONTOLOGY

American Journal of Geriatric Psychiatry, Volume 33, Issue 10, Page S71. Pub Date: 2025-07-14. DOI: 10.1016/j.jagp.2025.04.098

Julia Kimball, Ashley Abi Chaker, Alp Canbulat, Ipsit Vahia

Full text: https://www.sciencedirect.com/science/article/pii/S1064748125002088


96. USING MACHINE LEARNING MODELS TO DETECT EARLY ALZHEIMER’S DISEASE THROUGH SPEECH ANALYSIS

Introduction

There is an urgent need for novel approaches that may facilitate early detection of Alzheimer's disease and thus create targets for effective intervention and management. Current diagnostic methods often rely on expensive and/or time-consuming procedures such as brain imaging and cognitive assessments. A novel approach proposes leveraging AI/ML models to detect AD early through the analysis of spontaneous speech and language use. This method could advance AD diagnosis by offering a non-invasive, cost-effective, and easily accessible screening tool that may identify subtle variations in linguistic (and, by extension, neurocognitive) function that standard screening tools do not yet detect. Here, we explore the range of deep learning models that have been applied to language and cognition. We also compare their analytic approaches and available results, with a view to identifying which approach may translate most readily to clinical care.

Methods

We used a multi-faceted approach that included a literature review, brainstorming sessions with an interdisciplinary team and field experts, and targeted internet searches for relevant web-based resources. The focus of our search was to compile studies that explored the development and application of AI algorithms to identify subtle changes in speech patterns, linguistic features, and acoustic properties associated with the early stages of AD. We considered, but did not apply, a traditional biomedical search algorithm, since work in this space is often published outside the biomedical literature and because this is an exploratory project. We noted that, by analyzing extensive datasets of speech samples from both healthy individuals and those with AD, all the identified studies sought to establish robust predictive models for early detection. We further examined whether confounding variables present in current linguistic AD models, such as those arising from language barriers, are also present in trained deep learning models.

Results

Our investigation found that the literature consistently applies multimodal systems, encompassing both neural networks and traditional analysis models, fine-tuned for the early detection of Alzheimer's disease. Among these, the ADReSS dataset emerged as the most effective, with the ensemble method achieving the highest accuracy in predicting Alzheimer's disease based on speech patterns. However, we noted a crucial limitation: the model's training relied solely on English speech data. This restriction introduces bias and hinders generalizability. Languages exhibit distinct phonetic structures, accents, and rhythms, potentially causing a model trained exclusively on English to misinterpret speech from other languages. Furthermore, while deep neural networks excel at discerning complex patterns, their internal workings often remain opaque, making it challenging to ascertain the precise rationale behind specific predictions.
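As a rough illustration of the ensemble idea mentioned above, the sketch below averages the class probabilities of two models (soft voting). The abstract does not say how the reviewed ensemble was built, so the function name `soft_vote`, the two model outputs, the equal weights, and the 0.5 threshold are all assumptions made for illustration, and the numbers are invented.

```python
from typing import List, Sequence


def soft_vote(prob_sets: Sequence[Sequence[float]],
              weights: Sequence[float],
              threshold: float = 0.5) -> List[int]:
    """Combine per-model P(AD) estimates by weighted averaging (soft voting).

    prob_sets[m][i] is model m's estimated probability that sample i is AD.
    Returns 1 (flag for follow-up) or 0 for each sample.
    """
    if len(prob_sets) != len(weights):
        raise ValueError("need exactly one weight per model")
    total_weight = sum(weights)
    n_samples = len(prob_sets[0])
    labels = []
    for i in range(n_samples):
        # Weighted mean of the models' probabilities for sample i.
        avg = sum(w * probs[i] for w, probs in zip(weights, prob_sets)) / total_weight
        labels.append(1 if avg >= threshold else 0)
    return labels


# Hypothetical outputs, NOT data from the reviewed studies.
neural_probs = [0.9, 0.4, 0.2]       # e.g. a neural network on audio
traditional_probs = [0.7, 0.7, 0.1]  # e.g. a classical model on linguistic features
print(soft_vote([neural_probs, traditional_probs], weights=[1.0, 1.0]))  # [1, 1, 0]
```

In the second sample the two models disagree, and the averaged probability (0.55) decides the label; changing the weights shifts how much each model counts.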

Conclusions

Our review identifies a notable body of literature that outlines a range of deep learning models that have already been applied to identifying cognitive changes through the use of language. With large language models gaining rapid popularity, there is a tremendous opportunity to gather data samples from natural language, and by pairing the right model with the right type of language data, powerful new screening tools may be developed. Our work points to two key areas for future prioritization: 1) developing models trained on diverse languages and 2) expanding existing datasets to encompass a wider range of linguistic variations, including various dialects and demographics. These advancements will contribute to more equitable and reliable speech-based Alzheimer's disease detection tools.
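One simple way to surface the generalizability concern raised above is to report accuracy separately for each language group rather than as a single pooled figure. The sketch below does this for invented records; the function name `accuracy_by_group`, the record layout, the group labels, and the values are hypothetical and do not come from the reviewed studies.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def accuracy_by_group(records: Iterable[Tuple[str, int, int]]) -> Dict[str, float]:
    """Compute accuracy per group from (group, true_label, predicted_label) triples."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, truth, pred in records:
        total[group] += 1
        correct[group] += int(truth == pred)
    return {group: correct[group] / total[group] for group in total}


# Hypothetical predictions, for illustration only.
records = [
    ("English", 1, 1), ("English", 0, 0), ("English", 1, 1), ("English", 0, 1),
    ("Spanish", 1, 0), ("Spanish", 0, 0),
]
print(accuracy_by_group(records))  # {'English': 0.75, 'Spanish': 0.5}
```

A gap between groups like the one in this toy output would be hidden by the pooled accuracy (4 of 6 correct), which is why a per-group breakdown is useful when training data come from a single language.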


Source journal

American Journal of Geriatric Psychiatry (Medicine: Psychiatry)

CiteScore: 13.00

Self-citation rate: 4.20%

Articles published: 381

Review time: 26 days

About the journal: The American Journal of Geriatric Psychiatry is the leading source of information in the rapidly evolving field of geriatric psychiatry. This esteemed journal features peer-reviewed articles covering topics such as the diagnosis and classification of psychiatric disorders in older adults, epidemiological and biological correlates of mental health in the elderly, and psychopharmacology and other somatic treatments. Published twelve times a year, the journal serves as an authoritative resource for professionals in the field.