Title: Neural Networks Using Multiplicative Features Based on Second-Order Statistics for Acoustic and Speech Applications
Author: A. Kobayashi
Published in: 2022 Asia Conference on Advanced Robotics, Automation, and Control Engineering (ARACE)
Publication date: 2022-08-01
DOI: 10.1109/ARACE56528.2022.00029 (https://doi.org/10.1109/ARACE56528.2022.00029)
Citation count: 1
Abstract
This paper investigates multiplicative interactions between features in neural networks, such as auto-correlations. In pattern recognition, including spoken language processing, non-linear relationships among features have long been explored, e.g., high-order local auto-correlations and the multiplicative features seen in sigma-pi cells. These features are designed to capture correlations within spectro-temporal regions and thereby make classification more robust. However, features based on multiplicative interactions, or on elementary second-order statistics such as auto-correlations, have not been well explored in speech processing, so it remains open to discussion how much multiplicative features improve classification performance. We therefore investigate multiplicative interactions extracted from spectro-temporal regions by neural networks. We implement a simple multiplicative module that produces the interactions between features and conduct experiments on three classification tasks: acoustic event classification, acoustic scene classification, and speech recognition. The proposed neural networks with multiplicative blocks achieved promising improvements on all tasks: accuracy improved by 0.45% in acoustic event classification and by 2.15% in acoustic scene classification, and the phone error rate (PER) improved by 6.5% in phoneme recognition.
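To make the notion of an elementary second-order statistic concrete, the following is a minimal sketch of a local auto-correlation computed over a small spectro-temporal patch: for a given displacement, it sums the products of each value with its shifted neighbor. This is not the paper's multiplicative module, whose design the abstract does not specify; the function names, the patch layout (`patch[time][frequency]`), and the choice of shifts are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# Hypothetical layout: patch[time][frequency], e.g. a small log-mel region.
Patch = Sequence[Sequence[float]]


def local_autocorrelation(patch: Patch, shift: Tuple[int, int]) -> float:
    """Sum x[t][f] * x[t + dt][f + df] over positions where both terms exist."""
    dt, df = shift
    n_t = len(patch)
    n_f = len(patch[0]) if n_t else 0
    total = 0.0
    for t in range(n_t):
        for f in range(n_f):
            t2, f2 = t + dt, f + df
            # Skip products whose shifted partner falls outside the patch.
            if 0 <= t2 < n_t and 0 <= f2 < n_f:
                total += patch[t][f] * patch[t2][f2]
    return total


def multiplicative_features(patch: Patch,
                            shifts: Sequence[Tuple[int, int]]) -> List[float]:
    """One second-order feature per displacement (illustrative helper)."""
    return [local_autocorrelation(patch, s) for s in shifts]


# Toy 2x2 patch: rows are time frames, columns are frequency bins.
patch = [[1.0, 2.0],
         [3.0, 4.0]]
# (0, 0) gives the energy; (0, 1) and (1, 0) correlate neighbors
# along frequency and along time, respectively.
shifts = [(0, 0), (0, 1), (1, 0)]
features = multiplicative_features(patch, shifts)
print(features)  # [30.0, 14.0, 11.0]
```

In a neural network, products of this kind would typically be formed between learned feature maps rather than raw inputs, with the resulting values passed to later layers; the sketch only shows the underlying arithmetic.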