Study on Social Media Depression Classification Method Based on Text Data

Yiqing Zhou, Junzi Zhang
Published in: 2022 International Symposium on Intelligent Robotics and Systems (ISoIRS), 2022-10-01
DOI: 10.1109/ISoIRS57349.2022.00030 (https://doi.org/10.1109/ISoIRS57349.2022.00030)
Citations: 0

Abstract

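The causes of depression are not yet well understood, but the condition seriously harms people's health, so being able to identify it quickly is important. In this work, a text dataset collected from social media is first vectorized in two ways: with a Bag-of-words representation and with self-trained Word2Vec embeddings. The Bag-of-words representation is used to train and test several types of traditional machine learning classifiers. For the Word2Vec representation, CNN and RNN models are additionally trained. Finally, the classification performance of the traditional machine learning algorithms and the deep learning algorithms is compared across the different text representations in order to find a better text-based depression classification model. The experimental results show that the CNN and Logistic Regression models perform better on the text-based depression classification task.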
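To make the Bag-of-words plus Logistic Regression branch of this pipeline concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract gives no dataset, preprocessing, or hyperparameters, so the whitespace tokenizer, the invented `TOY_POSTS`, the learning rate, the epoch count, and all function names are assumptions made purely for illustration. The Word2Vec, CNN, and RNN branch is not covered.

```python
import math
from collections import Counter

def build_vocabulary(texts):
    """Map each distinct lowercase token to a column index."""
    vocab = {}
    for text in texts:
        for token in text.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab

def bag_of_words(text, vocab):
    """Return a token-count vector; out-of-vocabulary tokens are ignored."""
    vector = [0.0] * len(vocab)
    for token, count in Counter(text.lower().split()).items():
        if token in vocab:
            vector[vocab[token]] = float(count)
    return vector

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logistic_regression(X, y, lr=0.5, epochs=200):
    """Fit weights and bias with batch gradient descent on the log loss."""
    n_features = len(X[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for features, label in zip(X, y):
            score = sum(w * x for w, x in zip(weights, features)) + bias
            error = sigmoid(score) - label  # gradient of log loss w.r.t. score
            for j, x in enumerate(features):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - lr * g / len(X) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(X)
    return weights, bias

def predict(text, vocab, weights, bias):
    """Return 1 (depression-related) or 0 for a raw text string."""
    features = bag_of_words(text, vocab)
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if sigmoid(score) >= 0.5 else 0

# Invented toy posts (assumption): label 1 = depression-related, 0 = not.
TOY_POSTS = [
    ("i feel hopeless and empty every day", 1),
    ("nothing matters and i cannot sleep", 1),
    ("i feel so tired and hopeless", 1),
    ("had a great day hiking with friends", 0),
    ("excited about the concert this weekend", 0),
    ("great dinner with friends tonight", 0),
]

texts = [text for text, _ in TOY_POSTS]
labels = [label for _, label in TOY_POSTS]
vocab = build_vocabulary(texts)
X = [bag_of_words(text, vocab) for text in texts]
weights, bias = train_logistic_regression(X, labels)
```

After training, `predict("i feel hopeless", vocab, weights, bias)` classifies a new post with the learned weights. On real data one would hold out a test split and compare several classifiers on it, as the abstract describes, rather than inspect training predictions.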