Fine-Tuning the Wav2vec2 Model for Kazakh Speech: A Study on a Limited Corpus
Authors: Kairatuly Bauyrzhan, Mansurova Madina, Ospan Assel
Venue: 2023 IEEE International Conference on Smart Information Systems and Technologies (SIST)
Published: 2023-05-04
DOI: https://doi.org/10.1109/SIST58284.2023.10223504
In this study, we developed an automatic speech recognition model for Kazakh by fine-tuning the pre-trained XLSR-Wav2Vec2 model on a corpus of Kazakh speech. Our results show that fine-tuning the model on even a small corpus of Kazakh speech yields a significant increase in recognition accuracy. However, larger datasets are needed to evaluate the effectiveness of this approach further. The results of this study contribute to ongoing efforts to improve speech recognition technology for low-resource languages such as Kazakh.
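The abstract reports a gain in recognition accuracy but does not name the metric. As a minimal sketch, assuming accuracy is measured by word error rate (WER), a common choice for speech recognition, the comparison between reference transcripts and model output could look like the following. The function names and the sample Kazakh sentences are illustrative assumptions and do not come from the paper.

```python
from typing import List, Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    # prev[j] holds the distance between the first i-1 reference tokens
    # and the first j hypothesis tokens.
    prev: List[int] = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def word_error_rate(references: Sequence[str], hypotheses: Sequence[str]) -> float:
    """Corpus-level WER: total word edits divided by total reference words."""
    total_errors = 0
    total_words = 0
    for ref, hyp in zip(references, hypotheses):
        ref_words = ref.split()
        hyp_words = hyp.split()
        total_errors += edit_distance(ref_words, hyp_words)
        total_words += len(ref_words)
    if total_words == 0:
        raise ValueError("references contain no words")
    return total_errors / total_words


# Made-up sample transcripts (assumption, not data from the study).
references = ["сәлем әлем", "бұл тест сөйлем"]
hypotheses = ["сәлем әлем", "бұл тес сөйлем"]
wer = word_error_rate(references, hypotheses)
print(f"WER: {wer:.2f}")
```

Under this assumption, a lower WER after fine-tuning than before it would correspond to the increase in recognition accuracy that the abstract describes. On a small corpus, the estimate rests on few reference words, which is one reason larger datasets are needed to assess the approach.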