Muhammad Syafiq Nordin, A. L. Asnawi, Nur Aishah Binti Zainal, R. F. Olanrewaju, A. Jusoh, S. Ibrahim, N. F. M. Azmin
{"title":"基于TEO和MFCC语音特征的卷积神经网络(CNN)应力检测","authors":"Muhammad Syafiq Nordin, A. L. Asnawi, Nur Aishah Binti Zainal, R. F. Olanrewaju, A. Jusoh, S. Ibrahim, N. F. M. Azmin","doi":"10.1109/ICOCO56118.2022.10031771","DOIUrl":null,"url":null,"abstract":"The effect of stress on mental and physical health is very concerning making it a fascinating and socially valuable field of study nowadays. Although a number of stress markers have been deployed, there are still issues involved with using these kinds of approaches. By developing a speech-based stress detection system, it could solve the problems faced by other currently available methods of detecting stress since it is a non-invasive and contactless approach. In this work, a fusion of Teager Energy Operator (TEO) and Mel Frequency Cepstral Coefficients (MFCC) namely Teager-MFCC (T-MFCC) are proposed as the speech features to be extracted from speech signals in recognizing stressed emotions. Since stressed emotions affect the nonlinear components of speech, TEO is applied to reflect the instantaneous energy of the components. Convolutional Neural Network (CNN) classifier is used with the proposed T- MFCC features on the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) corpus. The proposed method (T-MFCC) had shown a better performance with classification accuracies of 95.83% and 95.37% for male and female speakers respectively compared to the MFCC feature extraction technique which achieves 84.26% (male) and 93.98% (female) classification accuracies.","PeriodicalId":319652,"journal":{"name":"2022 IEEE International Conference on Computing (ICOCO)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Stress Detection based on TEO and MFCC speech features using Convolutional Neural Networks (CNN)\",\"authors\":\"Muhammad Syafiq Nordin, A. L. Asnawi, Nur Aishah Binti Zainal, R. F. Olanrewaju, A. Jusoh, S. Ibrahim, N. F. M. Azmin\",\"doi\":\"10.1109/ICOCO56118.2022.10031771\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The effect of stress on mental and physical health is very concerning making it a fascinating and socially valuable field of study nowadays. Although a number of stress markers have been deployed, there are still issues involved with using these kinds of approaches. By developing a speech-based stress detection system, it could solve the problems faced by other currently available methods of detecting stress since it is a non-invasive and contactless approach. In this work, a fusion of Teager Energy Operator (TEO) and Mel Frequency Cepstral Coefficients (MFCC) namely Teager-MFCC (T-MFCC) are proposed as the speech features to be extracted from speech signals in recognizing stressed emotions. Since stressed emotions affect the nonlinear components of speech, TEO is applied to reflect the instantaneous energy of the components. Convolutional Neural Network (CNN) classifier is used with the proposed T- MFCC features on the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) corpus. 
The proposed method (T-MFCC) had shown a better performance with classification accuracies of 95.83% and 95.37% for male and female speakers respectively compared to the MFCC feature extraction technique which achieves 84.26% (male) and 93.98% (female) classification accuracies.\",\"PeriodicalId\":319652,\"journal\":{\"name\":\"2022 IEEE International Conference on Computing (ICOCO)\",\"volume\":\"17 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-11-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 IEEE International Conference on Computing (ICOCO)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICOCO56118.2022.10031771\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE International Conference on Computing (ICOCO)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICOCO56118.2022.10031771","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Stress Detection based on TEO and MFCC speech features using Convolutional Neural Networks (CNN)
The effect of stress on mental and physical health is a serious concern, making it a valuable field of study. Although a number of stress markers have been deployed, these approaches still have limitations. A speech-based stress detection system could address the problems faced by other currently available detection methods, since it is non-invasive and contactless. In this work, a fusion of the Teager Energy Operator (TEO) and Mel Frequency Cepstral Coefficients (MFCC), namely Teager-MFCC (T-MFCC), is proposed as the speech feature extracted from speech signals for recognizing stressed emotions. Since stressed emotions affect the nonlinear components of speech, TEO is applied to capture the instantaneous energy of those components. A Convolutional Neural Network (CNN) classifier is used with the proposed T-MFCC features on the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) corpus. The proposed T-MFCC features achieve classification accuracies of 95.83% and 95.37% for male and female speakers respectively, outperforming the standard MFCC feature extraction technique, which achieves 84.26% (male) and 93.98% (female).
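As a rough illustration of the feature pipeline described in the abstract, the sketch below applies the discrete Teager Energy Operator to a waveform and then extracts MFCCs from the TEO-processed signal. The use of librosa, the function names, the boundary handling, and the choice of 13 coefficients are assumptions for illustration, not the authors' exact implementation.

```python
import numpy as np
import librosa  # assumed third-party library for audio loading and MFCC extraction


def teager_energy(x: np.ndarray) -> np.ndarray:
    """Discrete Teager Energy Operator: psi[n] = x[n]^2 - x[n-1] * x[n+1]."""
    psi = np.empty_like(x)
    psi[1:-1] = x[1:-1] ** 2 - x[:-2] * x[2:]
    psi[0], psi[-1] = psi[1], psi[-2]  # replicate edge values at the signal boundaries (assumption)
    return psi


def t_mfcc(wav_path: str, n_mfcc: int = 13) -> np.ndarray:
    """Hypothetical T-MFCC sketch: MFCCs computed on the TEO-processed waveform."""
    y, sr = librosa.load(wav_path, sr=None)  # keep the file's native sample rate
    return librosa.feature.mfcc(y=teager_energy(y), sr=sr, n_mfcc=n_mfcc)


# Example usage on one utterance (the path is illustrative, not from the paper):
# features = t_mfcc("Actor_01/utterance.wav")  # shape: (n_mfcc, n_frames)
```

The resulting coefficient matrix is a 2-D, image-like representation, which is presumably what allows it to be fed to a CNN classifier in the same way a plain MFCC feature map would be.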