Depression Detection Based on Facial Expression, Audio and Gait
Authors: Ziqian Dai, Qiuping Li, Yichen Shang, Xin’an Wang
Venue: 2023 IEEE 6th Information Technology, Networking, Electronic and Automation Control Conference (ITNEC)
Publication date: 2023-02-24
DOI: https://doi.org/10.1109/ITNEC56291.2023.10082163
Depression is a mental illness that endangers patients’ physical and mental health and burdens families and society. A growing number of people suffer from depression, which increases the pressure on medical services. Depression can be diagnosed from a patient’s voice, facial expression, and gait. Existing studies mostly rely on a single modality or on a fusion of two. In this paper, we gathered 234 samples of gait video and of interview audio and video, proposed our pipeline, and compared the performance of the three single modalities with that of multi-modal fusion. Facial expression performs best, audio comes second, and gait comes last. Fusing the modalities improves performance. These results can guide the choice of modality in automatic screening or auxiliary diagnosis of depression. We also evaluated our model on the public datasets AVEC 2013, AVEC 2014, and Emotion-gait, which supports its validity.
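The abstract does not say how the modalities are fused, so the following is only a generic illustration of one common option, late (decision-level) fusion, and not the authors' pipeline. It assumes each single-modality model already outputs a probability of depression. The function name `late_fusion`, the modality keys, the example scores, and the 0.5 decision threshold are all hypothetical.

```python
from typing import Dict, Optional


def late_fusion(scores: Dict[str, float],
                weights: Optional[Dict[str, float]] = None) -> float:
    """Combine per-modality depression probabilities by weighted averaging.

    This is a generic decision-level fusion sketch, not the method from the paper.
    """
    if not scores:
        raise ValueError("at least one modality score is required")
    if weights is None:
        # Assumption: with no prior knowledge, every modality counts equally.
        weights = {name: 1.0 for name in scores}
    total = sum(weights[name] for name in scores)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(scores[name] * weights[name] for name in scores) / total


# Hypothetical per-modality outputs for one participant (made-up numbers).
example = {"face": 0.8, "audio": 0.6, "gait": 0.4}
fused = late_fusion(example)

# Assumption: 0.5 is used as the decision threshold purely for illustration.
label = "depressed" if fused >= 0.5 else "not depressed"
print(f"fused score: {fused:.2f}, label: {label}")
```

Because the abstract ranks facial expression above audio and audio above gait, one could pass larger weights for the stronger modalities, for example `late_fusion(example, {"face": 3.0, "audio": 2.0, "gait": 1.0})`. Those weight values are again illustrative and would need to be tuned on validation data.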