Categorizing Mental Stress: A Consistency-Focused Benchmarking of ML and DL Models for Multi-Label, Multi-Class Classification via Taxonomy-Driven NLP Techniques

Juswin Sajan John, Boppuru Rudra Prathap, Gyanesh Gupta, Jaivanth Melanaturu
Natural Language Processing Journal, Volume 11, Article 100162
DOI: 10.1016/j.nlp.2025.100162
Publication date: 2025-06-01

Abstract

Mental stress, a critical concern worldwide, necessitates precise and nuanced characterization. This study introduces a novel approach to characterizing mental stress through a multi-label, multi-class classification framework built on natural language processing techniques. Drawing on the existing literature and on discussions with psychologists and other mental health practitioners, we developed a taxonomy of 27 distinctive markers spread across four label categories, aiming to create a preliminary screening tool that leverages textual data.
The core objective is to identify the most suitable model for this complex task through a comprehensive evaluation of machine learning and deep learning algorithms. We experimented with support vector machine (SVM), random forest (RF), and long short-term memory (LSTM) algorithms, incorporating feature combinations involving Term Frequency-Inverse Document Frequency (TF-IDF) and Latent Dirichlet Allocation (LDA). The best performer in this comparative study was further evaluated against a large language model (LLM).
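The paper does not publish its pipeline, but the "TF-IDF + LDA" feature setup described above can be sketched as follows. This is a minimal illustration using scikit-learn with toy documents, hypothetical labels, and illustrative parameters (the real taxonomy has 27 markers across four label categories), not the authors' actual implementation:

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer, CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC

docs = [
    "deadlines at work keep piling up and I cannot sleep",
    "constant arguments at home leave me exhausted",
    "exams next week and I feel completely overwhelmed",
]
# Toy multi-label targets: one column per marker (illustrative only).
y = np.array([[1, 0], [0, 1], [1, 1]])

# TF-IDF features capture term salience.
tfidf = TfidfVectorizer(max_features=5000)
X_tfidf = tfidf.fit_transform(docs)

# LDA topic proportions add coarse thematic features.
counts = CountVectorizer(max_features=5000)
X_counts = counts.fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0)
X_lda = lda.fit_transform(X_counts)

# Concatenate both feature sets, mirroring the TF-IDF + LDA combination.
X = np.hstack([X_tfidf.toarray(), X_lda])

# A one-vs-rest linear SVM handles the multi-label setting;
# class_weight="balanced" echoes the class-weighting the study mentions.
clf = OneVsRestClassifier(LinearSVC(class_weight="balanced")).fit(X, y)
print(clf.predict(X).shape)  # one prediction column per label
```

The same concatenated feature matrix could instead be fed to a random forest or, padded into sequences, to an LSTM; the sketch only shows the feature-combination step that the SVM, RF, and LSTM variants share.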
The potential of large language models (LLMs), including their language understanding and prediction capabilities, is another key focus. We explore how these models could augment and advance mental health research, offering new perspectives and insights into the characterization of mental stress.
Our findings show that the top model, an LSTM with TF-IDF and LDA features (with class weights assigned), outperformed the PaLM model in consistency, achieving a coefficient of variation as low as 0.87% across all labels. Although the PaLM model achieved higher average performance, it exhibited greater variability across labels.
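For reference, the coefficient of variation used above is the standard deviation of the per-label scores divided by their mean. A quick computation with hypothetical per-label scores (illustrative numbers, not the paper's data):

```python
import numpy as np

# Hypothetical per-label F1 scores (illustrative only).
scores = np.array([0.91, 0.90, 0.92, 0.91])

# Coefficient of variation, expressed as a percentage.
cv = scores.std() / scores.mean() * 100
print(f"CV = {cv:.2f}%")  # → CV = 0.78%
```

A low CV indicates the model performs nearly uniformly across labels, which is the consistency criterion the benchmark emphasizes.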