Optimizing Speech Emotion Recognition with Hilbert Curve and convolutional neural network
Zijun Yang, Shi Zhou, Lifeng Zhang, Seiichi Serikawa
Cognitive Robotics, vol. 4, pp. 30-41. Published 2023-12-05. DOI: 10.1016/j.cogr.2023.12.001
https://www.sciencedirect.com/science/article/pii/S2667241323000411
Citation count: 0
Abstract
In speech emotion recognition, researchers strive to refine representation methods so that they capture emotional information more effectively. Traditional one-dimensional time series classification falls short in expressing the intricate emotional patterns present in speech signals, which limits accuracy and robustness. This study introduces an algorithm that uses Hilbert curves to transform one-dimensional speech data into two-dimensional form, improving the accuracy of feature extraction. A tiling module based on the Hilbert curve maximizes the arrangement of Hilbert curves to capture more emotional information. Results show spatial efficiency gains of up to 23,195 times in pixel units, improving data storage. With an accuracy of 98.73%, the proposed approach outperforms traditional methods on the same dataset, affirming its superior emotion classification performance. These empirical findings underscore the effectiveness of the proposed method in advancing speech emotion recognition.
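
The abstract does not spell out how the one-dimensional signal is laid onto the curve, so the following is only a minimal sketch of the general idea, not the authors' implementation. It uses the standard iterative conversion from a Hilbert curve index to grid coordinates. The names `hilbert_d2xy` and `signal_to_image`, the `order` parameter, and the zero padding of unused cells are assumptions made for illustration; the tiling module is not modeled.

```python
def hilbert_d2xy(order, d):
    """Map index d along a Hilbert curve to (x, y) on a 2**order x 2**order grid."""
    x = y = 0
    t = d
    s = 1
    side = 1 << order
    while s < side:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        # Rotate or flip the quadrant so that sub-curves join end to end.
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def signal_to_image(samples, order):
    """Place consecutive samples along the curve (hypothetical helper).

    Assumption: samples beyond the grid capacity are dropped and
    unused cells stay at zero.
    """
    side = 1 << order
    grid = [[0.0] * side for _ in range(side)]
    for d, value in enumerate(samples[: side * side]):
        x, y = hilbert_d2xy(order, d)
        grid[y][x] = value
    return grid


if __name__ == "__main__":
    # Eight consecutive samples on a 4 x 4 grid; the rest stays zero.
    for row in signal_to_image([0.1 * i for i in range(8)], order=2):
        print(row)
```

The property that makes this mapping attractive for time series is locality: samples that are adjacent in time always land in adjacent grid cells, so a two-dimensional convolution kernel sees short temporal neighborhoods as compact spatial patches.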