Speech Emotion Recognition Using a New Hybrid Quaternion-Based Echo State Network-Bilinear Filter
Fatemeh Daneshfar, S. J. Kabudian
2021 7th International Conference on Signal Processing and Intelligent Systems (ICSPIS), published 2021-12-29
DOI: 10.1109/ICSPIS54653.2021.9729337 (https://doi.org/10.1109/ICSPIS54653.2021.9729337)
Echo state networks (ESNs) are an efficient tool for representing dynamic data. However, ESNs are limited in their ability to model high-dimensional data. The most important limitations are the large amount of memory consumed by the echo states and the linear output layer of the ESN, which prevent increasing the number of reservoir units and making effective use of the higher-order statistics of the features those units provide. This work presents a new ESN-based structure in which quaternion algebra is used to compress the network data with a simple split function, and the linear output combiner is replaced by a multidimensional bilinear filter that performs the nonlinear computations of the output layer. In addition, two-dimensional principal component analysis (2DPCA) is used to reduce the amount of data passed to the bilinear filter. The coefficients and weights of the quaternion nonlinear ESN (QNESN) are optimized using a genetic algorithm (GA). To evaluate the proposed model against previous methods, speech emotion recognition (SER) experiments were performed on the EMODB dataset. Comparisons show that the proposed QNESN performs better than the simple ESN and most current SER systems.
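The abstract gives no equations, so the following is only a minimal sketch of the two output-layer styles it contrasts: a linear combiner applied to a single reservoir state, and a bilinear form applied to the whole matrix of states. It uses the generic textbook reservoir update, not the authors' QNESN. The quaternion compression, 2DPCA, and GA steps are omitted, and all function names, sizes, and weight ranges are hypothetical choices made for illustration.

```python
import math
import random


def make_reservoir(n_in, n_res, scale=0.1, seed=0):
    """Draw random input and recurrent weights (illustrative ranges only).

    With n_res * scale < 1, every row of the recurrent matrix has an
    absolute sum below 1, which keeps the state update contractive.
    """
    rng = random.Random(seed)
    w_in = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_res)]
    w = [[rng.uniform(-scale, scale) for _ in range(n_res)] for _ in range(n_res)]
    return w_in, w


def run_reservoir(w_in, w, inputs):
    """Return the states x(t) = tanh(W_in u(t) + W x(t-1)), one row per frame."""
    n_res = len(w)
    x = [0.0] * n_res
    states = []
    for u in inputs:
        x = [
            math.tanh(
                sum(wi * ui for wi, ui in zip(w_in[i], u))
                + sum(wj * xj for wj, xj in zip(w[i], x))
            )
            for i in range(n_res)
        ]
        states.append(x)
    return states


def linear_readout(w_out, x):
    """Linear combiner: y = w_out . x for one reservoir state x."""
    return sum(wi * xi for wi, xi in zip(w_out, x))


def bilinear_readout(a, states, b):
    """Bilinear form y = a^T X b over the T x N state matrix X.

    a weights the T time frames and b weights the N reservoir units.
    """
    return sum(a[t] * linear_readout(b, states[t]) for t in range(len(a)))


# Toy input: 5 frames of 3 features each (made-up values, not EMODB data).
frames = [[math.sin(0.3 * t + k) for k in range(3)] for t in range(5)]
w_in, w = make_reservoir(n_in=3, n_res=8)
states = run_reservoir(w_in, w, frames)

b = [0.5] * 8   # weights over reservoir units
a = [0.2] * 5   # weights over time frames
print(linear_readout(b, states[-1]))
print(bilinear_readout(a, states, b))
```

In this sketch the bilinear readout reduces to the linear one when `a` selects a single frame, which shows how the bilinear form generalizes the linear combiner by also weighting across time. How the paper's multidimensional bilinear filter is actually parameterized is not stated in the abstract.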