BERT and BERTopic for screening clinical depression on open-ended text messages collected through a mobile application from older adults.

IF 3.6 · Zone 2 (Medicine) · Q1 (Public, Environmental & Occupational Health)
Moo-Kwon Chung, Sang Yup Lee, Taeksoo Shin, Ji Young Park, Sangwon Hwang, Min-Hyuk Kim, Jinhee Lee, Kyoung-Joung Lee, Hyo-Sang Lim, Erdenebayar Urtnasan, YeonSu Jung, Dan-Kyung Kim, Eunji Shin, Jin-Kyung Lee
BMC Public Health 25(1):2161. Published 2025-06-10. DOI: 10.1186/s12889-025-23337-4 (https://doi.org/10.1186/s12889-025-23337-4). Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12150497/pdf/

Abstract

Background: Despite the high suicide rate in South Korea, older adults are reluctant to see a psychiatrist. Text mining has recently become a popular way to detect depression in social media posts, but older adults rarely use social media. More than 90% of them, however, use smartphones, and South Korea has made a public effort to use a mobile application to manage chronic health problems. Against this background, this study explores whether the risk of depression can be screened from text reports of major stressors collected from older adults via a mobile application.

Methods: We collected data on stress and depressive symptoms through our mobile application. We used pre-trained Natural Language Processing (NLP) models based on Bidirectional Encoder Representations from Transformers (BERT), implemented in Python with the Hugging Face Transformers library. A total of 1,332 text messages collected from 230 participants were analyzed with BERT models to detect clinical depression, as screened by the PHQ-9. Because the data were in Korean, we used KcBERT and KLUE BERT. BERTopic and dynamic BERTopic were used to identify which stress topics appeared in a high-risk group and how they changed over time.
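
The labeling step implied above (each text message paired with a PHQ-9 screening result) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the `Message` record, its field names, and the cutoff of 10 are assumptions. The abstract does not state which PHQ-9 threshold was used; a total of 10 or more is a commonly used screening cutoff on the 0 to 27 scale.

```python
from dataclasses import dataclass
from typing import List

PHQ9_ITEMS = 9      # the PHQ-9 has nine items, each scored 0 to 3
PHQ9_CUTOFF = 10    # ASSUMPTION: common screening cutoff, not stated in the abstract


@dataclass
class Message:
    """Hypothetical record: one stress report plus the PHQ-9 answers given with it."""
    participant_id: str
    text: str
    phq9_items: List[int]


def phq9_total(items: List[int]) -> int:
    """Sum the nine item scores after checking their count and range."""
    if len(items) != PHQ9_ITEMS:
        raise ValueError("PHQ-9 requires exactly nine item scores")
    if any(not 0 <= score <= 3 for score in items):
        raise ValueError("each PHQ-9 item must be scored 0 to 3")
    return sum(items)


def label_message(msg: Message, cutoff: int = PHQ9_CUTOFF) -> int:
    """Return 1 if the PHQ-9 total reaches the cutoff (screen-positive), else 0."""
    return int(phq9_total(msg.phq9_items) >= cutoff)
```

The resulting 0/1 labels are the kind of target a BERT-based sequence classifier would be fine-tuned on; the cutoff is kept as a parameter so a different threshold can be substituted.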

Results: KcBERT (precision = .89, recall = .86, F1 score = .87) performed slightly better than KLUE BERT (precision = .81, recall = .78, F1 score = .79), although both performed well in identifying clinical depression. In the BERTopic results, the hierarchical clusters were regrouped into four categories: financial problems, family-oriented stressful situations, physical and mental health problems, and work-related or acutely stressful situations. The dynamic BERTopic results showed longitudinal changes: event-related words such as a death in the family or a disease diagnosis were found more often in cases where depression risk increased, whereas words related to ongoing stressful situations appeared more often when the risk remained high.
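
For readers less familiar with the reported metrics, the sketch below shows how precision, recall, and F1 relate. The function names are illustrative. The abstract does not say whether the scores are for the positive class or averaged across classes, so the harmonic-mean check is a consistency check under the single-class assumption, not a reproduction of the authors' evaluation.

```python
def f1_from(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def precision_recall_f1(tp: int, fp: int, fn: int):
    """Compute the three metrics from true-positive, false-positive,
    and false-negative counts for a single class."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall, f1_from(precision, recall)


# Consistency check against the values reported in the abstract,
# ASSUMING they describe a single class.
kcbert_f1 = round(f1_from(0.89, 0.86), 2)   # reported as .87
klue_f1 = round(f1_from(0.81, 0.78), 2)     # reported as .79
```

Under that assumption the reported F1 scores agree with the reported precision and recall to two decimal places.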

Conclusion: These results suggest that collecting respondents' reports of stressful experiences can be useful for screening the risk of clinical depression. Including this function in a smartphone application publicly administered by community health care professionals can help monitor mental health in older adults. It can reach a hidden high-risk population suffering from depression in the community and provide enriched information about their risk factors.

Source journal: BMC Public Health (Medicine: Public, Environmental & Occupational Health)
CiteScore: 6.50
Self-citation rate: 4.40%
Articles published: 2108
Review time: 1 month
About the journal: BMC Public Health is an open access, peer-reviewed journal that considers articles on the epidemiology of disease and the understanding of all aspects of public health. The journal has a special focus on the social determinants of health, the environmental, behavioral, and occupational correlates of health and disease, and the impact of health policies, practices and interventions on the community.