BERT and BERTopic for screening clinical depression on open-ended text messages collected through a mobile application from older adults
Moo-Kwon Chung, Sang Yup Lee, Taeksoo Shin, Ji Young Park, Sangwon Hwang, Min-Hyuk Kim, Jinhee Lee, Kyoung-Joung Lee, Hyo-Sang Lim, Erdenebayar Urtnasan, YeonSu Jung, Dan-Kyung Kim, Eunji Shin, Jin-Kyung Lee
BMC Public Health, volume 25, issue 1, article 2161. Published 2025-06-10. DOI: https://doi.org/10.1186/s12889-025-23337-4
Abstract
Background: Despite the high suicide rate in South Korea, older adults are reluctant to see a psychiatrist. Text mining has recently gained popularity for detecting depression in social media posts, but older adults rarely use social media. More than 90% of them, however, use smartphones. South Korea has also made a public effort to use a mobile application to manage chronic health problems. Against this background, this study explores the possibility of screening for the risk of depression through textual data, collected from older adults via a mobile application, in which they report major stressors.
Methods: We collected data on stress and depressive symptoms through our mobile application. Pre-trained Bidirectional Encoder Representations from Transformers (BERT)-based Natural Language Processing (NLP) models were used, implemented in Python with the Hugging Face Transformers library. A total of 1,332 text messages collected from 230 participants were analyzed using BERT modeling to detect clinical depression, as screened by the PHQ-9. For Korean data, we used KcBERT and KLUE BERT. BERTopic and dynamic BERTopic were used to identify which stress topics appeared among the high-risk group and how they changed over time.
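The Methods state that clinical depression was "screened by the PHQ-9", which implies that each questionnaire response was turned into a binary label for the classifier. The following is a minimal sketch of that labeling step, not the authors' code. The PHQ-9 has nine items scored 0 to 3; the cutoff of 10 used here is an assumption (a commonly used threshold), since the abstract does not state which cutoff was applied, and the function names are hypothetical.

```python
from typing import Sequence

# Assumption: a total score of 10 or more counts as a positive screen.
# The abstract does not state the cutoff the authors used.
ASSUMED_CUTOFF = 10


def phq9_total(item_scores: Sequence[int]) -> int:
    """Sum the nine PHQ-9 items, each scored 0 to 3 (total range 0 to 27)."""
    if len(item_scores) != 9:
        raise ValueError("PHQ-9 has exactly nine items")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("each PHQ-9 item must be scored 0, 1, 2, or 3")
    return sum(item_scores)


def screen_label(item_scores: Sequence[int], cutoff: int = ASSUMED_CUTOFF) -> int:
    """Return 1 for a positive screen (total >= cutoff), otherwise 0."""
    return int(phq9_total(item_scores) >= cutoff)
```

A label produced this way could then be paired with the participant's text message as the target for a BERT sequence classifier; changing `cutoff` shows how sensitive the class balance is to the threshold.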
Results: KcBERT (precision = .89, recall = .86, F1 score = .87) performed slightly better than KLUE BERT (precision = .81, recall = .78, F1 score = .79), although both performed well in identifying clinical depression. In the BERTopic results, the hierarchical clusters were regrouped into four categories: financial problems, family-oriented stressful situations, physical and mental health problems, and work-related or acutely stressful situations. Dynamic BERTopic results showed longitudinal changes: event-related words such as family death or disease diagnosis were found more often in cases where depression risk increased, whereas words related to continued stressful situations appeared more often when the risk remained high.
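As a quick check on the reported metrics, the F1 score is the harmonic mean of precision and recall. The sketch below recomputes F1 from the precision and recall values quoted in the Results and compares it with the reported F1. The helper names and the 0.01 tolerance are illustrative choices. If the reported figures are averaged across classes (for example macro-averaged), the identity need not hold exactly; the abstract does not say which averaging was used.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def is_consistent(precision: float, recall: float, reported_f1: float,
                  tol: float = 0.01) -> bool:
    """Check that a reported F1 matches the recomputed value within tol."""
    return abs(f1_score(precision, recall) - reported_f1) <= tol


# (precision, recall, reported F1) as quoted in the Results paragraph.
reported = {
    "KcBERT": (0.89, 0.86, 0.87),
    "KLUE BERT": (0.81, 0.78, 0.79),
}

for model, (p, r, f1) in reported.items():
    print(model, round(f1_score(p, r), 4), is_consistent(p, r, f1))
```

Both recomputed values fall within rounding distance of the reported F1 scores, so the three figures for each model are mutually consistent under this definition.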
Conclusion: These results imply that collecting respondents' reports of stressful experiences can be useful for screening the risk of clinical depression. Including this function within a smartphone application publicly administered by community health care professionals can help monitor mental health in older adults. It can reach a hidden high-risk population suffering from depression in the community and provide enriched information about their risk factors.
About the journal
BMC Public Health is an open access, peer-reviewed journal that considers articles on the epidemiology of disease and the understanding of all aspects of public health. The journal has a special focus on the social determinants of health, the environmental, behavioral, and occupational correlates of health and disease, and the impact of health policies, practices and interventions on the community.