{"title":"Subword Speech Recognition for Agglutinative Languages","authors":"Alakbar Valizada","doi":"10.1109/AICT52784.2021.9620466","journal":{"name":"2021 IEEE 15th International Conference on Application of Information and Communication Technologies (AICT)"},"publicationDate":"2021-10-13","citationCount":"2","platform":"Semanticscholar","ListUrlMain":"https://doi.org/10.1109/AICT52784.2021.9620466"}
Subword Speech Recognition for Agglutinative Languages
The field of large vocabulary continuous speech recognition has advanced in recent years. Most research has used phonemes and words as the recognition units. In this work, we introduce and develop syllable-based subword modeling for speech recognition and compare it with word-based recognition. Our method adds a syllable layer between the phone and word levels. The proposed method was tested on the Azerbaijani language, using a speech database collected with mobile devices. The method is well suited to agglutinative languages: because the number of distinct syllables is smaller than the number of distinct words, our approach significantly reduces the number of out-of-vocabulary words. Experimental results show that our syllable-based method reduces the word error rate by 5% and outperforms the commonly used word-based approach. The method can also be applied to other agglutinative languages, especially those of the Turkic group.
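The out-of-vocabulary argument can be illustrated with a small sketch. It is not the author's implementation: the syllabifier below is a simplified rule commonly used for Turkic orthographies (one vowel per syllable, with a single preceding consonant opening the next syllable), the helper names `syllabify` and `oov_rate` are hypothetical, and the word lists are toy data chosen only for illustration. Input is assumed to be lowercase.

```python
# Minimal sketch of syllable units reducing OOV words (illustrative assumptions only).

# Azerbaijani vowels; every syllable is assumed to contain exactly one of them.
VOWELS = set("aeəıioöuü")


def syllabify(word):
    """Split a lowercase word into syllables with a simplified rule:
    the consonant directly before a vowel (if any) starts that vowel's syllable."""
    vowel_idx = [i for i, ch in enumerate(word) if ch in VOWELS]
    if len(vowel_idx) <= 1:
        return [word]
    boundaries = []
    for prev, cur in zip(vowel_idx, vowel_idx[1:]):
        # Adjacent vowels split between them; otherwise split one consonant earlier.
        boundaries.append(cur - 1 if cur - prev > 1 else cur)
    pieces, start = [], 0
    for b in boundaries:
        pieces.append(word[start:b])
        start = b
    pieces.append(word[start:])
    return pieces


def oov_rate(test_words, train_words, unit="word"):
    """Fraction of test words that cannot be built from units seen in training."""
    if unit == "word":
        known = set(train_words)
        missing = [w for w in test_words if w not in known]
    else:
        known = {s for w in train_words for s in syllabify(w)}
        missing = [w for w in test_words
                   if any(s not in known for s in syllabify(w))]
    return len(missing) / len(test_words)


# Toy data (hypothetical): suffixed forms in TEST are unseen as whole words.
TRAIN = ["kitab", "dostlar", "məktəb", "evlər"]
TEST = ["kitablar", "məktəblər", "dostlar"]

print(syllabify("kitablar"))                 # ['ki', 'tab', 'lar']
print(oov_rate(TEST, TRAIN, unit="word"))    # 2 of 3 test words are unseen
print(oov_rate(TEST, TRAIN, unit="syllable"))  # 0.0: all syllables were seen
```

In this toy setting, the plural forms are out of vocabulary for a word-level lexicon, but every one of their syllables already occurs in the training words, which is the effect the abstract attributes to agglutinative morphology. A real system would also need a syllable-to-phone lexicon and a way to rejoin recognized syllables into words, neither of which is shown here.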