Effective depression detection and interpretation: Integrating machine learning, deep learning, language models, and explainable AI

Impact factor: 2.3; JCR Q2 (Computer Science, Theory & Methods)

Array. Pub Date: 2025-02-07. DOI: 10.1016/j.array.2025.100375

Gazi Hasan Al Masud, Rejaul Islam Shanto, Ishmam Sakin, Muhammad Rafsan Kabir

Array, Volume 25, Article 100375. https://www.sciencedirect.com/science/article/pii/S2590005625000025

Citations: 0

Abstract

Depression is increasingly prevalent, particularly among young people, where it significantly impairs well-being and causes persistent distress. Early detection is crucial to addressing this growing concern. This study uses a range of machine learning, deep learning, and language models to detect depression among Bangladeshi university students. To address class imbalance in the dataset, resampling techniques such as SMOTE and Cluster Centroids are applied, and exhaustive hyperparameter optimization is performed to improve classification performance. The results indicate that machine learning algorithms, particularly Random Forest, predict depression effectively, with an accuracy of 91.1% and an F1-score of 91.6%. Language models such as RoBERTa also achieve strong results, with a recall of 98.6%. Explainable AI (XAI) methods, including SHAP and LIME, are used to interpret model predictions, underscoring the importance of transparency in machine learning. This work contributes to the early identification of depression by integrating machine learning, deep learning, natural language processing, and XAI techniques. Although the study focuses on Bangladeshi students and similar demographic groups, the proposed approaches are adaptable and can be applied to other populations.
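
The abstract names SMOTE as one of the resampling techniques but does not describe how it was implemented. As a rough illustration only, the core idea of SMOTE (creating synthetic minority-class rows by interpolating between a minority sample and one of its nearest minority neighbours) might be sketched as follows. The function name `smote_oversample`, its parameters, and the toy data are assumptions made for this example and are not the authors' code; in practice a library implementation such as the one in imbalanced-learn would normally be used instead.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Return n_new synthetic rows built by interpolating between minority
    samples and their k nearest minority-class neighbours (SMOTE's core idea).

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other rows
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest neighbours of `base` among the other minority rows
        neighbours = sorted(
            (row for j, row in enumerate(minority) if j != i),
            key=lambda row: math.dist(base, row),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Assumed toy data: 3 minority-class rows against 6 majority-class rows.
minority = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.4]]
majority_count = 6
new_rows = smote_oversample(minority, majority_count - len(minority), k=2)
balanced_minority = minority + new_rows
print(len(balanced_minority))
```

Because each synthetic row lies on a line segment between two real minority rows, the oversampled class stays inside the region the original minority data already occupies. Cluster Centroids, the other technique the abstract mentions, works in the opposite direction by shrinking the majority class.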
