RM-Transformer: A Transformer-based Model for Mandarin Speech Recognition
Authors: Xingmin Lu, Jianguo Hu, Shenhao Li, Yanyu Ding
Published in: 2022 IEEE 2nd International Conference on Computer Communication and Artificial Intelligence (CCAI), 2022-05-06
DOI: https://doi.org/10.1109/CCAI55564.2022.9807706
This paper proposes RM-Transformer, a network for Mandarin speech recognition. Traditional models use only the features from the top layer; RM-Transformer instead draws on features from different layers of the network. The proposed network also addresses the ambiguity that homophones cause in Mandarin speech recognition. Empirical evaluations were conducted on two widely used datasets, Aishell-1 and Aidatatang-200zh, and the experimental results verify the effectiveness of the proposed scheme.
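The abstract does not say how features from different layers are combined, so the following is only a minimal sketch of one common possibility: a softmax-weighted sum of per-layer outputs. The names `fuse_layers` and `layer_scores`, and the fusion rule itself, are assumptions made for illustration and are not taken from the paper.

```python
import math
from typing import List, Sequence

Frame = List[float]          # one feature vector
LayerOutput = List[Frame]    # frames x features for one encoder layer


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a short list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_layers(layer_outputs: Sequence[LayerOutput],
                layer_scores: Sequence[float]) -> LayerOutput:
    """Combine per-layer features with softmax-normalised weights.

    Hypothetical fusion rule for illustration only; the abstract does
    not describe the mechanism RM-Transformer actually uses.
    """
    if len(layer_outputs) != len(layer_scores):
        raise ValueError("need exactly one score per layer")
    weights = softmax(layer_scores)
    num_frames = len(layer_outputs[0])
    num_feats = len(layer_outputs[0][0])
    fused = [[0.0] * num_feats for _ in range(num_frames)]
    for weight, output in zip(weights, layer_outputs):
        for t in range(num_frames):
            for d in range(num_feats):
                fused[t][d] += weight * output[t][d]
    return fused


# Toy data: 3 encoder layers, 2 frames, 2 features per frame.
layers = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[2.0, 2.0], [2.0, 2.0]],
    [[3.0, 6.0], [9.0, 0.0]],
]

# Equal scores give a plain average over layers.
equal = fuse_layers(layers, [0.0, 0.0, 0.0])

# Very low scores for the lower layers put all weight on the top layer,
# which mimics the top-layer-only setup.
top_only = fuse_layers(layers, [-1e9, -1e9, 0.0])
```

In a trained model the scores would be learned parameters; with the scores of the lower layers pushed very low, the fusion falls back to using only the top layer, which is the traditional setup the abstract contrasts against.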