MDD-LLM: Towards accuracy large language models for major depressive disorder diagnosis

IF 4.9, Zone 2 (Medicine), JCR Q1 (Clinical Neurology)

Journal of Affective Disorders; Pub Date: 2025-06-26; DOI: 10.1016/j.jad.2025.119774

Yuyang Sha, Hongxin Pan, Wei Xu, Weiyu Meng, Gang Luo, Xinyu Du, Xiaobing Zhai, Henry H.Y. Tong, Caijuan Shi, Kefeng Li

Volume 388, Article 119774. Full text: https://www.sciencedirect.com/science/article/pii/S0165032725012169

Citations: 0

Abstract

Background

Major depressive disorder (MDD) affects more than 300 million people worldwide, making it a significant public health issue. However, the uneven distribution of medical resources and the complexity of diagnostic methods have resulted in inadequate attention to the disorder in many countries and regions.

Methods

This paper introduces MDD-LLM, a high-performance MDD diagnosis tool. It is an AI-driven framework that uses fine-tuned large language models (LLMs) and extensive real-world samples to address challenges in MDD diagnosis. Specifically, we select 274,348 individual records from the UK Biobank cohort and design three tabular data transformation methods to build a large corpus for training and evaluating the proposed method. To illustrate the advantages of MDD-LLM, we run comprehensive experiments and compare it against existing model-based solutions across multiple evaluation metrics.
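The abstract does not describe the three tabular data transformation methods, so the following is only a minimal sketch of one common way to turn a tabular record into text for an LLM: serializing each field as a short sentence. The field names, the sentence template, and the prompt wording are all hypothetical and are not taken from the paper or from UK Biobank.

```python
from typing import Mapping


def record_to_text(record: Mapping[str, object]) -> str:
    """Serialize one tabular row as 'The <field> is <value>.' sentences.

    Fields whose value is None are skipped. This template is an assumption
    for illustration, not one of the paper's transformation methods.
    """
    parts = [
        f"The {name.replace('_', ' ')} is {value}."
        for name, value in record.items()
        if value is not None
    ]
    return " ".join(parts)


def build_prompt(record: Mapping[str, object]) -> str:
    """Wrap the serialized record in a hypothetical yes/no diagnosis prompt."""
    return (
        "Based on the following participant information, answer whether "
        "the participant has major depressive disorder (yes or no).\n"
        f"{record_to_text(record)}\n"
        "Answer:"
    )


# Hypothetical record; these field names are invented for the example.
example = {
    "age": 54,
    "sex": "female",
    "sleep_duration_hours": 6,
    "smoking_status": None,  # missing value, dropped from the text
}

if __name__ == "__main__":
    print(build_prompt(example))
```

A sentence template like this keeps the field name next to its value, which is one reason it is often preferred over raw comma-separated rows when fine-tuning a language model on tabular data.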

Results

Experimental results show that MDD-LLM (70B) achieves an accuracy of 0.8378 and an AUC of 0.8919 (95% CI: 0.8799–0.9040), significantly outperforming existing machine learning and deep learning frameworks for MDD diagnosis. Given the limited exploration of LLMs in MDD diagnosis, we examine numerous factors that may influence the performance of our proposed method, including tabular data transformation techniques and different fine-tuning strategies. We also analyze the model's interpretability by requiring MDD-LLM to explain its predictions and provide corresponding reasons.
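The abstract reports an AUC with a 95% confidence interval but does not say how the interval was obtained. As an illustration only, the sketch below computes AUC as the probability that a randomly chosen positive case scores higher than a randomly chosen negative one, and estimates a percentile bootstrap interval. The bootstrap is an assumption here, not the paper's stated procedure, and the toy labels and scores are invented.

```python
import random
from typing import Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def bootstrap_ci(
    labels: Sequence[int],
    scores: Sequence[float],
    n_boot: int = 1000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap interval for AUC (resampling cases with replacement)."""
    if len(set(labels)) < 2:
        raise ValueError("need both classes to bootstrap AUC")
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample lacked one class; draw again
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy data, invented for the example.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]

if __name__ == "__main__":
    print(auc(toy_labels, toy_scores))
    print(bootstrap_ci(toy_labels, toy_scores))
```

The pairwise AUC above is quadratic in the number of cases, so for a cohort of the size described in the abstract a rank-based implementation would be the practical choice.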

Conclusion

This paper investigates the application of LLMs and large-scale training samples to MDD diagnosis. The findings indicate that LLM-driven schemes offer significant potential for accuracy, robustness, and interpretability in MDD diagnosis compared with traditional model-based solutions.




Source journal

Journal of Affective Disorders (Medicine: Psychiatry)

CiteScore

10.90

Self-citation rate

6.10%

Articles published

1319

Review time

9.3 weeks

Journal description: The Journal of Affective Disorders publishes papers concerned with affective disorders in the widest sense: depression, mania, mood spectrum, emotions and personality, anxiety and stress. It is interdisciplinary and aims to bring together different approaches for a diverse readership. Top-quality papers dealing with any aspect of affective disorders will be accepted, including neuroimaging, cognitive neurosciences, genetics, molecular biology, experimental and clinical neurosciences, pharmacology, neuroimmunoendocrinology, and intervention and treatment trials.