Multimodal LLM for enhanced Alzheimer’s Disease diagnosis: Interpretable feature extraction from Mini-Mental State Examination data

Meiwei Zhang, Yuwei Pan, Qiushi Cui, Yang Lü, Weihua Yu

Experimental Gerontology, Volume 208, Article 112812 (2025). DOI: 10.1016/j.exger.2025.112812
https://www.sciencedirect.com/science/article/pii/S053155652500141X
Abstract
Alzheimer’s Disease (AD) poses a considerable global health challenge, necessitating early and accurate diagnostics. The Mini-Mental State Examination (MMSE) is widely used for initial screening, but its traditional application often underutilizes the rich multimodal data it generates, such as videos, images, and speech. Integrating these modalities with modern Large Language Models (LLMs) offers untapped potential for improved diagnostics. In this study, we propose a multimodal LLM framework that fundamentally reinterprets MMSE data. Instead of relying on conventional, often limited MMSE features, the proposed LLM acts as a sophisticated cognitive analyst, directly processing the MMSE modalities. This deep multimodal understanding allows for the extraction of novel, high-level features that transcend traditional metrics. These features are not merely visual or acoustic signals, but rich semantic representations imbued with cognitive insights gleaned by the LLM. We then construct an interpretable decision tree classifier and derive a succinct rule list, yielding transparent diagnostic pathways readily understandable by clinicians. Finally, the framework integrates a counterfactual explanation module that provides individualized “what-if” analyses, illuminating how minimal feature changes could alter model outputs. In our empirical study on real-world clinical data, the framework improves diagnostic accuracy by approximately 6 percentage points while also providing diagnostic explanations, reinforcing its viability as a promising, interpretable, and scalable solution for early AD detection.
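To make the interpretable-classifier and counterfactual stages described above concrete, the sketch below fits a shallow decision tree on numeric features standing in for the LLM-extracted representations, prints its rule list, and probes a minimal "what-if" feature change. This is an illustrative assumption-laden sketch, not the paper's implementation: the feature names, synthetic data, and single-feature perturbation strategy are all hypothetical.

```python
# Illustrative sketch only: the paper's actual features, data, and
# counterfactual method are not specified here. Assumes LLM-derived
# MMSE features are already available as a numeric matrix.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)

# Hypothetical LLM-extracted semantic features (e.g., scores for drawing
# fluency, speech coherence, recall quality) with synthetic AD/non-AD labels.
feature_names = ["drawing_fluency", "speech_coherence", "recall_quality"]
X = rng.random((200, 3))
y = (X[:, 0] + 0.5 * X[:, 2] > 0.9).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A shallow tree keeps the diagnostic pathways readable for clinicians.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)
print(export_text(tree, feature_names=feature_names))  # succinct rule list

# Naive counterfactual probe: perturb one feature of a test case until the
# predicted label flips, exposing a minimal "what-if" change.
x = X_test[0].copy()
base_label = tree.predict([x])[0]
for delta in np.linspace(-0.5, 0.5, 101):
    trial = x.copy()
    trial[0] = np.clip(x[0] + delta, 0.0, 1.0)
    if tree.predict([trial])[0] != base_label:
        print(f"Change to {feature_names[0]} that flips the prediction: {delta:+.2f}")
        break
```

In practice a dedicated counterfactual library or multi-feature search would replace the single-feature scan, but the single scan keeps the "minimal feature change" idea visible in a few lines.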