{"title":"A Hybrid Network Speech Recognition Method for English Short Passage Reading Emotion Analysis in Multi-Access Edge Intelligence Scenarios","authors":"Jun Liao","doi":"10.1002/itl2.70108","DOIUrl":null,"url":null,"abstract":"<div>\n \n <p>Speech emotion recognition based on edge computing technology and deep learning can effectively assist in improving the quality of English short passage reading instruction. Restricted by limited computing resources of different edge devices, existing deep models pose a huge challenge for mobile deployment. To alleviate this issue, this paper proposes a novel hybrid speech emotion recognition model in multi-access edge intelligence scenarios. Firstly, we extract the Log Mel features from the speech signal collected by different clients' microphone sensors. Then, on the cloud platform, we deploy an efficient feature extraction backbone by exploiting 1D convolution operations, a minimal gated unit (MGU) module, and a Mamba module, which is introduced for exploiting long-range dependencies with linear computational complexity. We conducted extensive comparative experiments on the public dataset and our own English reading sentiment dataset, and our proposed model achieved the highest recognition performance.</p>\n </div>","PeriodicalId":100725,"journal":{"name":"Internet Technology Letters","volume":"8 6","pages":""},"PeriodicalIF":0.5000,"publicationDate":"2025-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Internet Technology Letters","FirstCategoryId":"1085","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1002/itl2.70108","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"TELECOMMUNICATIONS","Score":null,"Total":0}
Citations: 0
Abstract
Speech emotion recognition built on edge computing and deep learning can help improve the quality of English short-passage reading instruction. Because edge devices differ in their computing resources and those resources are limited, existing deep models are difficult to deploy on mobile hardware. To alleviate this issue, this paper proposes a novel hybrid speech emotion recognition model for multi-access edge intelligence scenarios. First, we extract log-Mel features from the speech signals collected by the microphone sensors of different clients. Then, on the cloud platform, we deploy an efficient feature extraction backbone that combines 1D convolution operations, a minimal gated unit (MGU) module, and a Mamba module; the Mamba module is introduced to capture long-range dependencies with linear computational complexity. We conducted extensive comparative experiments on a public dataset and on our own English reading sentiment dataset, and the proposed model achieved the highest recognition performance among the compared methods.
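The abstract names a minimal gated unit (MGU) module but does not give its equations. As a rough illustration only, the sketch below implements one recurrent cell using the commonly cited MGU formulation (a single forget gate that both filters the previous state and mixes in the candidate state). It is not the paper's implementation: the class name `MGUCell`, the layer sizes, the random weight initialisation, and the toy input frames are all assumptions made for this example.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(weights, bias, vector):
    """Return weights @ vector + bias for a list-of-rows matrix."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


class MGUCell:
    """Hypothetical minimal gated unit cell with one forget gate."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
        scale = 1.0 / math.sqrt(hidden_size)
        cols = hidden_size + input_size  # weights act on [h_prev, x]

        def matrix():
            return [[rng.uniform(-scale, scale) for _ in range(cols)]
                    for _ in range(hidden_size)]

        self.hidden_size = hidden_size
        self.w_f, self.b_f = matrix(), [0.0] * hidden_size
        self.w_h, self.b_h = matrix(), [0.0] * hidden_size

    def step(self, x, h_prev):
        # Forget gate: f_t = sigmoid(W_f [h_{t-1}, x_t] + b_f)
        f = [sigmoid(z) for z in affine(self.w_f, self.b_f, h_prev + x)]
        # Candidate: tanh(W_h [f_t * h_{t-1}, x_t] + b_h)
        gated = [fi * hi for fi, hi in zip(f, h_prev)]
        cand = [math.tanh(z) for z in affine(self.w_h, self.b_h, gated + x)]
        # New state: h_t = (1 - f_t) * h_{t-1} + f_t * candidate
        return [(1.0 - fi) * hi + fi * ci
                for fi, hi, ci in zip(f, h_prev, cand)]

    def run(self, frames):
        """Process a sequence of feature frames and return the final state."""
        h = [0.0] * self.hidden_size
        for x in frames:
            h = self.step(x, h)
        return h


# Toy stand-in for a sequence of log-Mel frames (4 bins, 5 time steps).
frames = [[0.1, 0.2, 0.3, 0.4] for _ in range(5)]
cell = MGUCell(input_size=4, hidden_size=3)
final_state = cell.run(frames)
```

In a model like the one described, the input frames would be log-Mel feature vectors and the weights would be learned rather than random; the point here is only that an MGU uses one gate, which is why it is lighter than a GRU or LSTM cell.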