Speech Recognition Method Based on Deep Learning of Artificial Intelligence: An example of BLSTM-CTC model
Authors: Kangyu Chen, Zhiyuan Peng
Published in: Proceedings of the 2023 5th International Symposium on Signal Processing Systems, 2023-03-24
DOI: 10.1145/3606193.3606201 (https://doi.org/10.1145/3606193.3606201)
Driven by the rapid development of information technology, networking, and intelligent systems, China has made considerable progress in intelligent technology, producing many advanced artificial intelligence, machine learning, and deep learning techniques that promote intelligent, information-driven development across major fields. Deep learning, described here as the fusion of artificial intelligence and machine learning technology, lays the foundation for reform and innovation in intelligent speech recognition and intelligent robotics. Improving the practical performance of intelligent speech recognition therefore requires continued optimization of deep-learning-based recognition methods. Drawing on the relevant literature, this paper addresses the problem that speech signals produce phoneme features of varying duration, which reduces recognition accuracy. Taking BLSTM-CTC as an example, it standardizes phoneme features of different lengths using deep learning. Evaluation on the Thchs30 and ST-CMDS datasets shows that the MCFN-based BLSTM-CTC speech recognition model achieves a lower word error rate than the traditional speech recognition model.
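To make two terms in the abstract concrete, here is a minimal sketch of (a) the standard CTC collapse rule used in greedy decoding and (b) the usual word error rate computed from edit distance. It is not the authors' code: the function names, the blank index `0`, and the toy inputs are assumptions made for illustration, and the abstract does not say whether tokens are words, characters, or syllables.

```python
from typing import List, Sequence

BLANK = 0  # assumed index of the CTC blank label (hypothetical choice)


def ctc_greedy_collapse(frame_labels: Sequence[int], blank: int = BLANK) -> List[int]:
    """Collapse per-frame labels: merge consecutive repeats, then drop blanks."""
    output: List[int] = []
    previous = None
    for label in frame_labels:
        # Emit a label only when it differs from the previous frame and is not blank.
        if label != previous and label != blank:
            output.append(label)
        previous = label
    return output


def word_error_rate(reference: Sequence[str], hypothesis: Sequence[str]) -> float:
    """Return (substitutions + deletions + insertions) / len(reference)."""
    if not reference:
        raise ValueError("reference must contain at least one token")
    # row[j] is the edit distance between the reference prefix processed
    # so far and hypothesis[:j]; a single row is updated in place.
    row = list(range(len(hypothesis) + 1))
    for i, ref_token in enumerate(reference, start=1):
        diagonal, row[0] = row[0], i
        for j, hyp_token in enumerate(hypothesis, start=1):
            cost = 0 if ref_token == hyp_token else 1
            diagonal, row[j] = row[j], min(
                row[j] + 1,       # deletion of a reference token
                row[j - 1] + 1,   # insertion of a hypothesis token
                diagonal + cost,  # substitution or exact match
            )
    return row[-1] / len(reference)


if __name__ == "__main__":
    # Frames "blank,1,1,blank,1,2,2,blank" collapse to the label sequence 1,1,2.
    print(ctc_greedy_collapse([0, 1, 1, 0, 1, 2, 2, 0]))  # [1, 1, 2]
    # One substitution (b -> x) and one deletion (d) over four reference tokens.
    print(word_error_rate("a b c d".split(), "a x c".split()))  # 0.5
```

The collapse step shows how CTC maps a variable number of frames per phoneme onto a label sequence of fixed content, which is the variable-duration issue the abstract raises. The blank between the two `1` frames is what lets a repeated label survive the merge.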