{"title":"Speech Emotion Recognition using Hybrid Architectures","authors":"Michael Norval, Zenghui Wang","doi":"10.47839/ijc.23.1.3430","DOIUrl":null,"url":null,"abstract":"The detection of human emotions from speech signals remains a challenging frontier in audio processing and human-computer interaction domains. This study introduces a novel approach to Speech Emotion Recognition (SER) using a Dendritic Layer combined with a Capsule Network (DendCaps). A Convolutional Neural Network (NN) and a Long Short-Time Neural Network (CLSTM) hybrid model are used to create a baseline which is then compared to the DendCap model. Integrating dendritic layers and capsule networks for speech emotion detection can harness the unique advantages of both architectures, potentially leading to more sophisticated and accurate models. Dendritic layers, inspired by the nonlinear processing properties of dendritic trees in biological neurons, can handle the intricate patterns and variabilities inherent in speech signals, while capsule networks, with their dynamic routing mechanisms, are adept at preserving hierarchical spatial relationships within the data, enabling the model to capture more refined emotional subtleties in human speech. The main motivation for using DendCaps is to bridge the gap between the capabilities of biological neural systems and artificial neural networks. This combination aims to capitalize on the hierarchical nature of speech data, where intricate patterns and dependencies can be better captured. Finally, two ensemble methods namely stacking and boosting are used for evaluating the CLSTM and DendCaps networks and the experimental results show that stacking of the CLSTM and DendCaps networks gives the superior result with a 75% accuracy.","PeriodicalId":37669,"journal":{"name":"International Journal of Computing","volume":"17 7","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.47839/ijc.23.1.3430","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"Computer Science","Score":null,"Total":0}
引用次数: 0
Abstract
The detection of human emotions from speech signals remains a challenging frontier in audio processing and human-computer interaction domains. This study introduces a novel approach to Speech Emotion Recognition (SER) using a Dendritic Layer combined with a Capsule Network (DendCaps). A Convolutional Neural Network (NN) and a Long Short-Time Neural Network (CLSTM) hybrid model are used to create a baseline which is then compared to the DendCap model. Integrating dendritic layers and capsule networks for speech emotion detection can harness the unique advantages of both architectures, potentially leading to more sophisticated and accurate models. Dendritic layers, inspired by the nonlinear processing properties of dendritic trees in biological neurons, can handle the intricate patterns and variabilities inherent in speech signals, while capsule networks, with their dynamic routing mechanisms, are adept at preserving hierarchical spatial relationships within the data, enabling the model to capture more refined emotional subtleties in human speech. The main motivation for using DendCaps is to bridge the gap between the capabilities of biological neural systems and artificial neural networks. This combination aims to capitalize on the hierarchical nature of speech data, where intricate patterns and dependencies can be better captured. Finally, two ensemble methods namely stacking and boosting are used for evaluating the CLSTM and DendCaps networks and the experimental results show that stacking of the CLSTM and DendCaps networks gives the superior result with a 75% accuracy.
期刊介绍:
The International Journal of Computing Journal was established in 2002 on the base of Branch Research Laboratory for Automated Systems and Networks, since 2005 it’s renamed as Research Institute of Intelligent Computer Systems. A goal of the Journal is to publish papers with the novel results in Computing Science and Computer Engineering and Information Technologies and Software Engineering and Information Systems within the Journal topics. The official language of the Journal is English; also papers abstracts in both Ukrainian and Russian languages are published there. The issues of the Journal are published quarterly. The Editorial Board consists of about 30 recognized worldwide scientists.