Title: Memristor-Based Progressive Hierarchical Conformer Architecture for Speech Emotion Recognition
Authors: Tianhao Zhao, Yue Zhou, Xiaofang Hu
Journal: International Journal of Bifurcation and Chaos
Volume: 15
Publication date: 2024-07-05
Publication type: Journal Article
DOI: 10.1142/s0218127424501177
URL: https://doi.org/10.1142/s0218127424501177
Platform: Semanticscholar
Citations: 0
Abstract
Speech Emotion Recognition (SER) is a challenging task because emotional expression is diverse and complex. Owing to its powerful feature extraction capabilities, the Transformer Network (TN) shows advantages and potential in SER. However, the limited size of available datasets and the difficulty of decoupling emotional features limit its performance and pose challenges for implementing SER on edge devices. To address these issues, we present a Memristor-based Progressive Hierarchical Conformer Architecture (MPCA) and design a conformer submodule that leverages convolution to mitigate TN’s limitations in SER. We propose attention-based feature decoupling, which employs hierarchical extraction to decouple speaker characteristics and retain the relevant components, thereby obtaining reliable emotional features. Furthermore, we propose a reconfigurable circuit implementation scheme for MPCA based on operator multiplexing, which yields flexible modules that can be dynamically adjusted to the resources of edge devices; the stability of the designed circuit is analyzed through PSPICE simulation experiments. We show that the proposed MPCA achieves state-of-the-art performance in SER while significantly reducing system power consumption, offering a solution for SER implementation on edge devices.
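The abstract does not specify the internals of the conformer submodule. As a rough illustration only, the sketch below shows the general idea behind conformer-style blocks: self-attention supplies global context across all speech frames, and a convolution over time adds local context. Everything here is an assumption made for illustration and is not the authors' MPCA design: the function names, the identity query/key/value projections, the single attention head, the fixed smoothing kernel shared by all channels, and the omission of the feed-forward, normalization, and gating stages found in full conformer blocks.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Single-head scaled dot-product attention.

    Assumption: queries, keys, and values are the input frames themselves
    (identity projections), to keep the sketch short.
    """
    d = len(x[0])
    out = []
    for q in x:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in x]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, x)) for j in range(d)])
    return out


def depthwise_conv1d(x, kernel):
    """Convolve each channel over time independently, zero-padded to keep length.

    Assumption: the same kernel is reused for every channel for brevity.
    """
    t, d, k = len(x), len(x[0]), len(kernel)
    half = k // 2
    out = []
    for i in range(t):
        frame = []
        for j in range(d):
            acc = 0.0
            for n in range(k):
                idx = i + n - half
                if 0 <= idx < t:  # frames outside the sequence count as zero
                    acc += kernel[n] * x[idx][j]
            frame.append(acc)
        out.append(frame)
    return out


def add(a, b):
    """Element-wise sum of two (time x channel) nested lists."""
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def conformer_like_block(x, kernel=(0.25, 0.5, 0.25)):
    """Attention (global context), then convolution (local context), each with a residual."""
    x = add(x, self_attention(x))
    x = add(x, depthwise_conv1d(x, kernel))
    return x


# Four hypothetical feature frames with two channels each.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
out = conformer_like_block(frames)
print(len(out), len(out[0]))  # prints "4 2": the block preserves the input shape
```

Because both stages preserve the (time, channel) shape, blocks of this kind can be stacked, which is what makes a hierarchical arrangement of them possible in principle.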