P. Kozierski, Talar Sadalla, S. Drgas, A. Dabrowski, Joanna Zietkiewicz
2017 22nd International Conference on Methods and Models in Automation and Robotics (MMAR), published 2017-08-01. DOI: 10.1109/MMAR.2017.8046899 (https://doi.org/10.1109/MMAR.2017.8046899)
The impact of vocabulary size and language model order on the Polish whispery speech recognition
The article presents studies on automatic recognition of whispery speech. A new corpus of whispery speech was used in the research. The aim of the studies was to check how vocabulary size and language model order influence speech recognition quality. It was concluded that a large vocabulary continuous speech recognition (LVCSR) model can be prepared even from recordings containing only 5,000 different words. It was also found that a third-order language model is the best choice. The difference between normal and whispery speech is negligible, manifesting only as a higher word error rate (about 1.5 times higher for whispery speech).
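The abstract reports results as a word error rate (WER) without defining it. As a minimal sketch, assuming the paper uses the standard definition (substitutions, deletions and insertions from a word-level edit distance, divided by the number of reference words), WER could be computed as below. The function name `word_error_rate` and the example sentences are hypothetical and are not taken from the paper or its corpus.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one substituted word out of three reference words.
print(word_error_rate("ala ma kota", "ala ma psa"))
```

Under this definition, the reported ratio of about 1.5 means that a recognizer scoring some WER on normal speech would score roughly 1.5 times that value on whispery speech.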