A novel hybrid clustering and classification framework for multi-level depression detection in social media texts

Impact factor: 8.0; Zone 2 (Computer Science); JCR Q1 (Automation & Control Systems)

Engineering Applications of Artificial Intelligence. Pub Date: 2025-08-16. DOI: 10.1016/j.engappai.2025.111952

Parisa Khodabakhshi, Masoud Mahootchi, Hadi Mosadegh

Volume 160, Article 111952. Publisher page: https://www.sciencedirect.com/science/article/pii/S0952197625019608

Citations: 0


A novel hybrid clustering and classification framework for multi-level depression detection in social media texts

Depression is a prevalent psychiatric condition worldwide, with significant social and economic implications. Despite its high incidence, many individuals with depression remain undiagnosed and untreated. Meanwhile, people increasingly use social media platforms to express their emotions and thoughts. Consequently, leveraging these platforms for depression detection may help address several related challenges. This paper proposes a three-stage methodology, based on text mining techniques, to determine the severity of depression in individuals who post textual content on social media. In the proposed framework, each post is transformed into a vector of numerical features using established feature extraction methods. Principal Component Analysis is then applied to select the most informative features for identifying whether a post indicates depression via a classification algorithm. If depression is detected, the method clusters the relevant posts based on their characteristics, grouping similar texts together. A second classification model is then applied within each cluster to determine the level of depression. To equip the model with effective algorithms at each stage, the Taguchi method is used to identify the best combination of feature extraction, clustering, and classification techniques. Specifically, Bidirectional Encoder Representations from Transformers (BERT) is used for deep contextual feature extraction, Deep Embedded Clustering (DEC) is employed for clustering, and Support Vector Machine (SVM) is used for classification. Numerical results show that the proposed approach can accurately classify individuals’ posts into one of four depression levels: non-depressed, low, moderate, and severe. These findings suggest that social networks offer a platform for assessing mental health through textual analysis.
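The control flow of the three stages described above (detect, cluster, then grade severity within each cluster) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: posts are assumed to be already converted to numeric vectors (the BERT and PCA steps are left out), a toy nearest-centroid classifier stands in for the SVM, plain k-means stands in for DEC, and the ground-truth labels decide which training posts are clustered. The class name `ThreeStageModel`, the helper functions, and the two-dimensional toy vectors are all hypothetical.

```python
import math
import random
from collections import defaultdict


def centroid(rows):
    """Component-wise mean of a non-empty collection of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def nearest(point, centers):
    """Key of the centre closest to `point` under Euclidean distance."""
    return min(centers, key=lambda key: math.dist(point, centers[key]))


class NearestCentroid:
    """Toy classifier; a stand-in for the SVM named in the abstract."""

    def fit(self, X, y):
        groups = defaultdict(list)
        for x, label in zip(X, y):
            groups[label].append(x)
        self.centers = {label: centroid(rows) for label, rows in groups.items()}
        return self

    def predict_one(self, x):
        return nearest(x, self.centers)


def kmeans(X, k, iters=20, seed=0):
    """Plain k-means; a stand-in for Deep Embedded Clustering (DEC)."""
    centers = dict(enumerate(random.Random(seed).sample(X, k)))
    for _ in range(iters):
        groups = defaultdict(list)
        for x in X:
            groups[nearest(x, centers)].append(x)
        # Clusters that lost all their points are simply dropped.
        centers = {c: centroid(rows) for c, rows in groups.items()}
    return centers


class ThreeStageModel:
    """Hypothetical wrapper: detect, then cluster, then grade per cluster."""

    def fit(self, X, levels, k=2):
        # Stage 1: binary detector, depressed (True) vs. non-depressed (False).
        flags = [lvl != "non-depressed" for lvl in levels]
        self.detector = NearestCentroid().fit(X, flags)
        # Stage 2: cluster only the posts labelled as depressed.
        depressed = [(x, lvl) for x, lvl in zip(X, levels) if lvl != "non-depressed"]
        centers = kmeans([x for x, _ in depressed], k)
        members = defaultdict(list)
        for x, lvl in depressed:
            members[nearest(x, centers)].append((x, lvl))
        self.centers = {c: centers[c] for c in members}
        # Stage 3: one severity classifier per cluster.
        self.severity = {c: NearestCentroid().fit(*zip(*rows))
                         for c, rows in members.items()}
        return self

    def predict_one(self, x):
        if not self.detector.predict_one(x):
            return "non-depressed"
        return self.severity[nearest(x, self.centers)].predict_one(x)


# Invented 2-D toy vectors, for illustration only (not data from the paper).
X = [[-5.0, -5.0], [-5.2, -4.8], [-4.8, -5.1],   # non-depressed
     [2.0, 0.0], [2.2, 0.3],                      # low
     [2.1, 4.0], [2.3, 4.2],                      # moderate
     [7.0, 7.0], [7.2, 6.8]]                      # severe
levels = ["non-depressed"] * 3 + ["low"] * 2 + ["moderate"] * 2 + ["severe"] * 2

model = ThreeStageModel().fit(X, levels)
for post in ([-5.1, -4.9], [2.1, 0.1], [2.2, 4.1], [7.1, 6.9]):
    print(post, "->", model.predict_one(post))
```

Routing each detected post to its own cluster's classifier means every severity model only has to separate the levels that actually occur among similar posts, which is presumably the motivation for placing clustering between the two classification stages.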


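Source journal: Engineering Applications of Artificial Intelligence (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 9.60

Self-citation rate: 10.00%

Articles published: 505

Review time: 68 days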

Journal description: Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution, witnessing remarkable advancements across various machine learning methodologies. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research elucidating the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI utilized in real-world engineering applications, validated using publicly available datasets to ensure the replicability of research outcomes. Join us in exploring the transformative potential of AI in engineering.