Validation of Machine Learning-Based Assessment of Major Depressive Disorder from Paralinguistic Speech Characteristics in Routine Care

IF 4.7 | Zone 2, Medicine | Q1 Psychiatry

Depression and Anxiety | Pub Date: 2024-04-09 | DOI: 10.1155/2024/9667377

Jonathan F. Bauer, Maurice Gerczuk, Lena Schindler-Gmelch, Shahin Amiriparian, David Daniel Ebert, Jarek Krajewski, Björn Schuller, Matthias Berking


Citations: 0

Abstract

New developments in machine learning-based analysis of speech can be hypothesized to facilitate the long-term monitoring of major depressive disorder (MDD) during and after treatment. To test this hypothesis, we collected 550 speech samples from telephone-based clinical interviews with 267 individuals in routine care. With these data, we trained and evaluated a machine learning system to identify the absence/presence of an MDD diagnosis (as assessed with the Structured Clinical Interview for DSM-IV) from paralinguistic speech characteristics. Our system classified diagnostic status of MDD with an accuracy of 66% (sensitivity: 70%, specificity: 62%). Permutation tests indicated that the machine learning system classified MDD significantly better than chance. However, deriving diagnoses from cut-off scores of common depression scales was superior to the machine learning system, with an accuracy of 73% for the Hamilton Rating Scale for Depression (HRSD), 74% for the Quick Inventory of Depressive Symptomatology–Clinician version (QIDS-C), and 73% for the depression module of the Patient Health Questionnaire (PHQ-9). Moreover, training a machine learning system that incorporated both speech analysis and depression scales resulted in accuracies between 73% and 76%. Thus, while the findings of the present study demonstrate that automated speech analysis has the potential to identify patterns of depressed speech, it does not substantially improve the validity of classifications from common depression scales. In conclusion, speech analysis may not yet be able to replace common depression scales in clinical practice, since it cannot yet provide the necessary accuracy in depression detection. This trial is registered with DRKS00023670.
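For readers unfamiliar with the evaluation terms used in the abstract, the following is a minimal sketch of how accuracy, sensitivity, and specificity are computed from binary predictions, and how a simple label-permutation test can estimate whether an accuracy exceeds chance. It is not the authors' pipeline: the function names, the toy labels, and the choice to permute labels against fixed predictions (rather than retraining a model on permuted labels) are all assumptions made for illustration.

```python
import random


def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = MDD, 0 = no MDD)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def metrics(y_true, y_pred):
    """Return (accuracy, sensitivity, specificity)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")  # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")  # true negative rate
    return accuracy, sensitivity, specificity


def permutation_p_value(y_true, y_pred, n_permutations=2000, seed=0):
    """Estimate how often shuffled labels yield an accuracy at least as high as observed.

    Simplified variant (an assumption of this sketch): predictions are held
    fixed and only the true labels are shuffled.
    """
    rng = random.Random(seed)
    observed = metrics(y_true, y_pred)[0]
    shuffled = list(y_true)
    at_least_as_good = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if metrics(shuffled, y_pred)[0] >= observed:
            at_least_as_good += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (at_least_as_good + 1) / (n_permutations + 1)


# Hypothetical toy data, not taken from the study: 10 positive and 10 negative cases.
y_true = [1] * 10 + [0] * 10
y_pred = [1] * 7 + [0] * 3 + [0] * 6 + [1] * 4

accuracy, sensitivity, specificity = metrics(y_true, y_pred)
p_value = permutation_p_value(y_true, y_pred)
print(f"accuracy={accuracy:.2f} sensitivity={sensitivity:.2f} specificity={specificity:.2f}")
print(f"permutation p-value={p_value:.3f}")
```

In this toy example, 7 of 10 positive cases and 6 of 10 negative cases are classified correctly, so sensitivity and specificity can differ even when overall accuracy looks moderate, which is why the abstract reports all three figures.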


Source journal

Depression and Anxiety (Medicine: Psychiatry)

CiteScore

15.00

Self-citation rate

1.40%

Publication volume

Review time

4-8 weeks

About the journal: Depression and Anxiety is a scientific journal that focuses on the study of mood and anxiety disorders, as well as related phenomena in humans. The journal is dedicated to publishing high-quality research and review articles that contribute to the understanding and treatment of these conditions. The journal places a particular emphasis on articles that contribute to the clinical evaluation and care of individuals affected by mood and anxiety disorders. It prioritizes the publication of treatment-related research and review papers, as well as those that present novel findings that can directly impact clinical practice. The journal's goal is to advance the field by disseminating knowledge that can lead to better diagnosis, treatment, and management of these disorders, ultimately improving the quality of life for those who suffer from them.