Depression detection with machine learning of structural and non-structural dual languages

Impact factor 3.3; JCR Q3 (Engineering, Biomedical)

Healthcare Technology Letters. Pub Date: 2024-06-10. DOI: 10.1049/htl2.12088

Filza Rehmani, Qaisar Shaheen, Muhammad Anwar, Muhammad Faheem, Shahzad Sarwar Bhatti

Healthcare Technology Letters, vol. 11, no. 4, pp. 218-226. Full text: https://ietresearch.onlinelibrary.wiley.com/doi/10.1049/htl2.12088


Abstract

Depression is a serious mental state that negatively affects thoughts, feelings, and actions. Social media use is growing rapidly, and people increasingly express themselves in their regional languages. In Pakistan and India, many people use Roman Urdu on social media, which makes Roman Urdu important for predicting depression in these regions. However, previous studies have made no significant contribution to predicting depression from Roman Urdu, either alone or in combination with structured languages such as English. This study aims to create a Roman Urdu dataset for predicting depression risk in dual languages [Roman Urdu (non-structural language) + English (structural language)]. Two datasets were used: Roman Urdu data manually converted from English on Facebook, and English comments from Kaggle. These datasets were merged for the experiments. Machine learning models, including Support Vector Machine (SVM), Support Vector Machine with Radial Basis Function kernel (SVM-RBF), Random Forest (RF), and Bidirectional Encoder Representations from Transformers (BERT), were tested. Depression risk was classified as not depressed, moderate, or severe. Experiments show that the SVM achieved the best result, with an accuracy of 0.84, compared with existing models. The presented study advances research on predicting depression in Asian countries.
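The following is a minimal sketch, not the authors' code, of two steps the abstract describes: merging a Roman Urdu set and an English set into one three-class dataset, and scoring predictions by accuracy. It assumes the reported figure is a fraction (0.84, i.e. 84% of comments classified correctly). The function names, the placeholder comments, and the predictions are all invented for illustration, and no classifier is trained here.

```python
# Illustrative sketch only; all names and data below are hypothetical.

LABELS = ("not depressed", "moderate", "severe")


def merge_datasets(roman_urdu, english):
    """Combine two lists of (text, label) pairs into one list of rows,
    tagging each row with the language it came from."""
    merged = []
    for lang, pairs in (("roman_urdu", roman_urdu), ("english", english)):
        for text, label in pairs:
            if label not in LABELS:
                raise ValueError(f"unknown label: {label!r}")
            merged.append({"text": text, "label": label, "lang": lang})
    return merged


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels (0.0 to 1.0)."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("label lists must be non-empty and equal in length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Placeholder rows standing in for real comments (invented for this example).
roman_urdu_rows = [
    ("<roman urdu comment 1>", "severe"),
    ("<roman urdu comment 2>", "not depressed"),
]
english_rows = [("<english comment 1>", "moderate")]
merged = merge_datasets(roman_urdu_rows, english_rows)

# Invented predictions: 4 of 25 are deliberately wrong, so 21 of 25 are correct.
y_true = ["not depressed"] * 10 + ["moderate"] * 10 + ["severe"] * 5
y_pred = list(y_true)
y_pred[0] = "moderate"
y_pred[10] = "severe"
y_pred[11] = "not depressed"
y_pred[24] = "moderate"

score = accuracy(y_true, y_pred)
print(f"accuracy = {score:.2f} ({score:.0%})")
```

With 21 of 25 invented predictions correct, the score is 0.84 as a fraction, which is the same quantity as 84% and not 0.84%.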


Source journal

Healthcare Technology Letters (Health Professions: Health Information Management)

CiteScore: 6.10

Self-citation rate: 4.80%

Review time: 22 weeks

About the journal: Healthcare Technology Letters aims to bring together an audience of biomedical and electrical engineers, physical and computer scientists, and mathematicians to enable the exchange of the latest ideas and advances through rapid online publication of original healthcare technology research. Major themes of the journal include (but are not limited to):

Major technological/methodological areas: biomedical signal processing; biomedical imaging and image processing; bioinstrumentation (sensors, wearable technologies, etc.); biomedical informatics.

Major application areas: cardiovascular and respiratory systems engineering; neural engineering, neuromuscular systems; rehabilitation engineering; bio-robotics, surgical planning, and biomechanics; therapeutic and diagnostic systems, devices, and technologies; clinical engineering; healthcare information systems, telemedicine, and mHealth.