Deep temporal modelling of clinical depression through social media text

Natural Language Processing Journal, Pub Date: 2024-01-09, DOI: 10.1016/j.nlp.2023.100052

Nawshad Farruque, Randy Goebel, Sudhakar Sivapalan, Osmar Zaïane

Volume 6, Article 100052. Full text: https://www.sciencedirect.com/science/article/pii/S2949719123000493

Citations: 0

Abstract

We describe the development of a model to detect user-level clinical depression based on a user’s temporal social media posts. Our model uses a Depression Symptoms Detection (DSD) classifier, which is trained on the largest existing samples of clinician-annotated tweets for clinical depression symptoms. We subsequently use our DSD model to extract clinically relevant features, e.g., depression scores and their consequent temporal patterns, as well as user posting activity patterns, e.g., quantifying their “no activity” or “silence.” Furthermore, to evaluate the efficacy of these extracted features, we create three kinds of datasets, including a test dataset, from two existing well-known benchmark datasets for user-level depression detection. We then provide accuracy measures based on single features, baseline features and feature ablation tests, at several different levels of temporal granularity. The relevant data distributions and clinical depression detection settings can be exploited to draw a complete picture of the impact of different features across our created datasets. Finally, we show that, in general, only semantic-oriented representation models perform well. However, clinical features may enhance overall performance provided that the training and testing distributions are similar and there is more data in a user’s timeline. As a consequence, the predictive capability of depression scores increases significantly when they are used in a more sensitive clinical depression detection setting.
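
To make the feature types mentioned in the abstract more concrete, the following is a minimal sketch of how per-post depression scores might be aggregated into temporal windows and how "silence" might be quantified. It is not the authors' implementation: the function name `temporal_features`, the `window_days` parameter, the use of a per-window mean, and the definition of silence as the fraction of windows with no posts are all assumptions made for illustration. The per-post scores are assumed to come from some symptom classifier such as the DSD model described above.

```python
from datetime import datetime, timedelta
from statistics import mean


def temporal_features(posts, window_days=7):
    """Aggregate per-post scores into fixed-size time windows.

    posts: list of (timestamp, score) pairs, where each score is assumed
           to be produced by a per-post depression symptom classifier.
    window_days: hypothetical granularity knob (e.g. 1, 7, or 30 days).
    """
    if not posts:
        return {"window_scores": [], "silence_ratio": 0.0}

    posts = sorted(posts, key=lambda p: p[0])
    start = posts[0][0]
    window = timedelta(days=window_days)

    # Number of windows needed to cover the span from first to last post.
    n_windows = (posts[-1][0] - start) // window + 1
    buckets = [[] for _ in range(n_windows)]
    for timestamp, score in posts:
        buckets[(timestamp - start) // window].append(score)

    # A window with no posts is treated as "silent" and scored as None.
    window_scores = [mean(b) if b else None for b in buckets]
    silent_windows = sum(1 for b in buckets if not b)
    return {
        "window_scores": window_scores,
        "silence_ratio": silent_windows / n_windows,
    }


if __name__ == "__main__":
    example_posts = [
        (datetime(2024, 1, 1), 0.25),
        (datetime(2024, 1, 2), 0.75),
        (datetime(2024, 1, 16), 1.0),
    ]
    print(temporal_features(example_posts, window_days=7))
```

Changing `window_days` corresponds to evaluating the same timeline at a different temporal granularity; the resulting sequence of window scores and the silence ratio could then be passed to a user-level classifier.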
