Yuyang Sha, Hongxin Pan, Wei Xu, Weiyu Meng, Gang Luo, Xinyu Du, Xiaobing Zhai, Henry H.Y. Tong, Caijuan Shi, Kefeng Li
MDD-LLM: Towards accuracy large language models for major depressive disorder diagnosis
Journal of Affective Disorders, volume 388, Article 119774. Published 2025-06-26. Impact factor 4.9 (JCR Q1, Clinical Neurology).
DOI: 10.1016/j.jad.2025.119774
https://www.sciencedirect.com/science/article/pii/S0165032725012169
Citation count: 0
Abstract
Background
Major depressive disorder (MDD) affects more than 300 million people worldwide, making it a significant public health issue. However, the uneven distribution of medical resources and the complexity of diagnostic methods have left the disorder inadequately addressed in many countries and regions.
Methods
This paper introduces MDD-LLM, a high-performance MDD diagnosis tool: an AI-driven framework that uses fine-tuned large language models (LLMs) and extensive real-world samples to tackle challenges in MDD diagnosis. Specifically, we select 274,348 individual records from the UK Biobank cohort and design three tabular data transformation methods to create a large corpus for training and evaluating the proposed method. To illustrate the advantages of MDD-LLM, we perform comprehensive experiments and compare it against existing model-based solutions across multiple evaluation metrics.
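The abstract does not describe the three tabular data transformation methods, so the following is only a minimal sketch of one common way to turn a tabular record into text for an LLM: a sentence template per feature. The function names, the feature names in `example`, and the prompt wording are all hypothetical and are not taken from the paper or from UK Biobank.

```python
from typing import Mapping, Optional


def record_to_text(record: Mapping[str, Optional[object]]) -> str:
    """Serialize one tabular record as 'The <feature> is <value>.' sentences.

    Features whose value is None are treated as missing and skipped.
    """
    parts = []
    for name, value in record.items():
        if value is None:
            continue  # assumption: missing values are simply omitted
        parts.append(f"The {name} is {value}.")
    return " ".join(parts)


def build_prompt(record: Mapping[str, Optional[object]]) -> str:
    """Wrap the serialized record in a hypothetical yes/no diagnosis question."""
    return (
        "Below is a participant record.\n"
        f"{record_to_text(record)}\n"
        "Question: Does this participant have major depressive disorder? "
        "Answer yes or no."
    )


# Hypothetical record; the feature names are illustrative only.
example = {
    "age": 54,
    "sex": "female",
    "sleep duration (hours)": 6,
    "smoking status": None,
}

if __name__ == "__main__":
    print(build_prompt(example))
```

A template of this kind keeps every feature name visible to the model, at the cost of longer inputs than a compact "name: value" list would produce.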
Results
Experimental results show that MDD-LLM (70B) achieves an accuracy of 0.8378 and an AUC of 0.8919 (95% CI: 0.8799–0.9040), significantly outperforming existing machine learning and deep learning frameworks for MDD diagnosis. Given the limited exploration of LLMs in MDD diagnosis, we examine several factors that may influence the performance of the proposed method, including tabular data transformation techniques and different fine-tuning strategies. We also analyze the model's interpretability by requiring MDD-LLM to explain its predictions and provide the corresponding reasons.
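The abstract reports an AUC with a 95% confidence interval but does not say how the interval was computed. As an illustration only, the sketch below computes AUC as the fraction of positive/negative pairs ranked correctly and estimates an interval with a percentile bootstrap; the bootstrap is an assumption here, not the authors' stated procedure, and the toy labels and scores are invented.

```python
import random
from typing import List, Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(
    labels: Sequence[int],
    scores: Sequence[float],
    n_boot: int = 1000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap interval for AUC (an assumed method)."""
    rng = random.Random(seed)
    n = len(labels)
    stats: List[float] = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample contained a single class; draw again
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Invented toy data, not results from the paper.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]

if __name__ == "__main__":
    print(auc(toy_labels, toy_scores))
    print(bootstrap_ci(toy_labels, toy_scores, n_boot=200))
```

The pairwise loop is quadratic in the number of samples, which is fine for a small illustration; at the scale of the cohort described above, a rank-based computation would be the practical choice.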
Conclusion
This paper investigates the application of LLMs and large-scale training samples to MDD diagnosis. The findings indicate that LLM-driven schemes offer significant potential for accuracy, robustness, and interpretability in MDD diagnosis compared to traditional model-based solutions.
About the journal:
The Journal of Affective Disorders publishes papers concerned with affective disorders in the widest sense: depression, mania, mood spectrum, emotions and personality, anxiety and stress. It is interdisciplinary and aims to bring together different approaches for a diverse readership. Top quality papers will be accepted dealing with any aspect of affective disorders, including neuroimaging, cognitive neurosciences, genetics, molecular biology, experimental and clinical neurosciences, pharmacology, neuroimmunoendocrinology, intervention and treatment trials.