Using the Short-Time Fourier Transform and ResNet to Diagnose Depression from Speech Data

Ayman Elfaki, A. L. Asnawi, A. Jusoh, A. F. Ismail, S. Ibrahim, N. F. Mohamed Azmin, Nik Nur Wahidah Binti Nik Hashim
Published in: 2021 IEEE International Conference on Computing (ICOCO), 2021-11-17
DOI: 10.1109/ICOCO53166.2021.9673562 (https://doi.org/10.1109/ICOCO53166.2021.9673562)
Citations: 2

Abstract

Depression is a common illness that affects many people, particularly since the onset of the COVID-19 pandemic. It often arises when a person has difficulty coping with stressful life events. It can occur at any point in a person's life, and it pervades all aspects of daily living. Currently, depression diagnosis relies on patient interviews and self-report questionnaires, which depend heavily on the patient's honesty and the clinician's subjective experience. In this paper, we investigate the viability of using the Short-Time Fourier Transform (STFT) as a feature descriptor to objectively diagnose depression from speech data. The dataset used in this research is the Audio/Visual Emotion Challenge 2017 (AVEC2017) dataset. The model is a modified ResNet18 architecture that performs binary classification (i.e., depressed or non-depressed). The STFT is computed from the speech signal to generate a mel-spectrogram, which is used to train and test the model. The experiment shows that relying solely on STFT-derived features as input yields an F1 score of 74.71% in classifying depression.
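The feature-extraction step described in the abstract (STFT of the speech signal, then a mel-spectrogram) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not state the sample rate, window length, hop size, or number of mel bands, so the values below (16 kHz, 512, 128, 64), the Hann window, the log scaling, and all function names are assumptions chosen only for illustration.

```python
import numpy as np


def hz_to_mel(f):
    # Common HTK-style mel formula (an assumption; other variants exist).
    return 2595.0 * np.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def stft_power(signal, n_fft=512, hop=128):
    """Power spectrogram from a Hann-windowed STFT, shape (n_fft//2 + 1, n_frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    spectrum = np.fft.rfft(frames, axis=1)  # one FFT per windowed frame
    return (np.abs(spectrum) ** 2).T


def mel_filterbank(sr, n_fft, n_mels):
    """Triangular mel filters, shape (n_mels, n_fft//2 + 1)."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    hz_points = mel_to_hz(mel_points)
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    bank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(n_mels):
        left, center, right = hz_points[m], hz_points[m + 1], hz_points[m + 2]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        bank[m] = np.maximum(0.0, np.minimum(rising, falling))
    return bank


def log_mel_spectrogram(signal, sr, n_fft=512, hop=128, n_mels=64):
    """Log-scaled mel-spectrogram in dB, shape (n_mels, n_frames)."""
    power = stft_power(signal, n_fft, hop)
    mel = mel_filterbank(sr, n_fft, n_mels) @ power
    return 10.0 * np.log10(mel + 1e-10)  # small offset avoids log(0)


# Synthetic stand-in for a speech recording: one second of a 440 Hz tone.
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2.0 * np.pi * 440.0 * t)
spec = log_mel_spectrogram(tone, sr)
print(spec.shape)
```

The resulting two-dimensional array can be treated as a single-channel image, which is the kind of input a convolutional network such as ResNet18 accepts; how the authors sized, segmented, or normalized their spectrograms is not specified in the abstract.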