Artificial intelligence-driven natural language processing for identifying linguistic patterns in Alzheimer's disease and mild cognitive impairment: A study of lexical, syntactic, and cohesive features of speech through picture description tasks.
Authors: Cynthia A Nyongesa, Mike Hogarth, Judy Pa
Journal: Journal of Alzheimer's Disease (impact factor 3.4; JCR Q2, Neurosciences)
Published: 2025-05-07
DOI: https://doi.org/10.1177/13872877251339756
Citations: 0
Abstract
Background: Language deficits often occur early in the neurodegenerative process, yet traditional methods frequently fail to detect subtle changes. Natural language processing (NLP) offers a novel approach to identifying linguistic patterns associated with cognitive impairment.
Objective: We aimed to analyze linguistic features that differentiate cognitively unimpaired (CU), mild cognitive impairment (MCI), and Alzheimer's disease (AD) groups.
Methods: Data were extracted from picture description tasks performed by 336 participants in the DementiaBank datasets. NLP toolkits were used to identify 53 linguistic features, aggregated into 4 domains: lexical, structural, syntactic, and discourse. Cognitive function was evaluated with the Mini-Mental State Examination (MMSE) and the Montreal Cognitive Assessment (MoCA), using normal diagnostic cutoffs.
Results: With age and education as covariates, ANOVA and post-hoc Tukey's HSD tests revealed that linguistic features such as pronoun usage, syntactic complexity, and lexical sophistication differed significantly between the CU, MCI, and AD groups (p < 0.05). Notably, past tense and personal references were higher in AD than in both CU and MCI (p < 0.001), while pronoun usage differed between AD and CU (p < 0.0001). Correlations indicated that higher pronoun rates and lower syntactic complexity were associated with lower MMSE scores. Some features, such as conjunctions and determiners, approached significance but lacked consistent differentiation.
Conclusions: With the growing adoption of artificial intelligence (AI)-based scribing, these results emphasize the potential of targeted linguistic analysis as a digital biomarker to enable continuous screening for cognitive impairment.
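To make the idea of a lexical feature concrete, the following is a minimal sketch of how a pronoun rate and a type-token ratio could be computed from a picture-description transcript. It is not the authors' pipeline: the abstract does not name the NLP toolkits used, and the pronoun list, the `tokenize` and `lexical_features` helpers, and the sample sentence are all assumptions introduced here for illustration. A real analysis would likely rely on part-of-speech tagging rather than a fixed word list.

```python
import re

# Hypothetical, hand-picked pronoun list for illustration only. A word list
# cannot tell a pronoun "that" from a determiner "that"; a part-of-speech
# tagger would be needed for that distinction.
PRONOUNS = {
    "i", "me", "my", "we", "us", "our", "you", "your",
    "he", "him", "his", "she", "her", "it", "its",
    "they", "them", "their", "this", "that", "these", "those",
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def lexical_features(text):
    """Return token count, pronoun rate, and type-token ratio for a transcript."""
    tokens = tokenize(text)
    if not tokens:
        # Avoid division by zero for empty transcripts.
        return {"n_tokens": 0, "pronoun_rate": 0.0, "type_token_ratio": 0.0}
    n_pronouns = sum(1 for t in tokens if t in PRONOUNS)
    return {
        "n_tokens": len(tokens),
        # Share of tokens that appear in the pronoun list.
        "pronoun_rate": n_pronouns / len(tokens),
        # Distinct word forms divided by total tokens (a crude diversity measure).
        "type_token_ratio": len(set(tokens)) / len(tokens),
    }


# Invented example sentence, not taken from DementiaBank.
sample = "The boy is on the stool and he is taking the cookies"
features = lexical_features(sample)
print(features)
```

Per-transcript values like these would then be compared across the CU, MCI, and AD groups, for example with the covariate-adjusted ANOVA and Tukey's HSD tests described in the abstract.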
About the journal:
The Journal of Alzheimer's Disease (JAD) is an international multidisciplinary journal to facilitate progress in understanding the etiology, pathogenesis, epidemiology, genetics, behavior, treatment and psychology of Alzheimer's disease. The journal publishes research reports, reviews, short communications, hypotheses, ethics reviews, book reviews, and letters-to-the-editor. The journal is dedicated to providing an open forum for original research that will expedite our fundamental understanding of Alzheimer's disease.