{"title":"希伯来语的自我监督学习——从模型到实践框架","authors":"O. Gal, Rafi Michaeli, Y. Doytsher","doi":"10.14738/tmlai.106.13515","DOIUrl":null,"url":null,"abstract":"In this paper, we present the current state-of-the-art models for Automatic Speech Recognition due to a self-supervised training implemented on Hebrew language. The motivation behind using self-supervised learning is that even though we wouldn't probably get the accuracy rates as if we choose a supervised learning, we still can achieve amazing results with relatively low amount of data. This way of training allows us to train a model on unlabeled data (or to use a pre-trained model, which is always more accessible. It’s goal in the first unsupervised phase is to learn some good representations from raw audio samples, which can be useful for speech recognition tasks, without using any label data. Then, the model can be fine-tuned on a particular dataset for a specific purpose. It means that our involvement really occurs in the last layers of the model. This kind of training proved to be very powerful. We present complete framework from model to practice with simulations and training model and present an impressive result on Hebrew.","PeriodicalId":119801,"journal":{"name":"Transactions on Machine Learning and Artificial Intelligence","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2022-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Self-Supervised Learning in Hebrew–Model to Practice Framework\",\"authors\":\"O. Gal, Rafi Michaeli, Y. Doytsher\",\"doi\":\"10.14738/tmlai.106.13515\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this paper, we present the current state-of-the-art models for Automatic Speech Recognition due to a self-supervised training implemented on Hebrew language. The motivation behind using self-supervised learning is that even though we wouldn't probably get the accuracy rates as if we choose a supervised learning, we still can achieve amazing results with relatively low amount of data. This way of training allows us to train a model on unlabeled data (or to use a pre-trained model, which is always more accessible. It’s goal in the first unsupervised phase is to learn some good representations from raw audio samples, which can be useful for speech recognition tasks, without using any label data. Then, the model can be fine-tuned on a particular dataset for a specific purpose. It means that our involvement really occurs in the last layers of the model. This kind of training proved to be very powerful. 
We present complete framework from model to practice with simulations and training model and present an impressive result on Hebrew.\",\"PeriodicalId\":119801,\"journal\":{\"name\":\"Transactions on Machine Learning and Artificial Intelligence\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-12-07\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Transactions on Machine Learning and Artificial Intelligence\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.14738/tmlai.106.13515\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Transactions on Machine Learning and Artificial Intelligence","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.14738/tmlai.106.13515","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Self-Supervised Learning in Hebrew–Model to Practice Framework
In this paper, we present current state-of-the-art models for Automatic Speech Recognition (ASR) based on self-supervised training applied to the Hebrew language. The motivation for using self-supervised learning is that, although we are unlikely to match the accuracy of a fully supervised approach, we can still achieve strong results with a relatively small amount of labeled data. This style of training lets us train a model on unlabeled data (or use a pre-trained model, which is usually more accessible). The goal of the first, unsupervised phase is to learn good representations from raw audio samples, useful for speech recognition tasks, without using any labeled data. The model can then be fine-tuned on a particular dataset for a specific purpose, which means our intervention is concentrated in the last layers of the model. This kind of training has proved to be very powerful. We present a complete framework, from model to practice, with simulations and model training, and report impressive results on Hebrew.
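The recipe the abstract describes, self-supervised pretraining on unlabeled raw audio followed by supervised fine-tuning concentrated in the upper layers, can be sketched as follows. The abstract does not name a specific architecture, so this is a minimal sketch assuming a wav2vec 2.0-style model via the HuggingFace transformers library; the vocabulary size and the dummy training batch are hypothetical placeholders, not the paper's actual setup. Freezing the feature encoder mirrors the abstract's point that the adaptation happens mainly in the last layers.

```python
# Sketch: fine-tune a self-supervised speech model for ASR with CTC.
# Assumptions (not from the paper): wav2vec 2.0 via HuggingFace transformers,
# a 34-symbol Hebrew character vocabulary, and dummy data in place of a real
# labeled Hebrew dataset.
import torch
from transformers import Wav2Vec2ForCTC

# Load a checkpoint pretrained self-supervised on unlabeled raw audio.
# The new CTC head (lm_head) is randomly initialized and learned during fine-tuning.
model = Wav2Vec2ForCTC.from_pretrained(
    "facebook/wav2vec2-large-xlsr-53",  # multilingual pretrained checkpoint
    vocab_size=34,                      # hypothetical Hebrew character vocabulary
    ctc_loss_reduction="mean",
)

# Freeze the convolutional feature encoder: the pretrained low-level audio
# representations stay fixed, and training only updates the upper layers.
model.freeze_feature_encoder()

# One supervised fine-tuning step with CTC loss.
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)
batch_audio = torch.randn(2, 16000)            # two 1-second clips at 16 kHz (dummy)
batch_labels = torch.randint(0, 34, (2, 12))   # dummy token ids for transcripts

loss = model(input_values=batch_audio, labels=batch_labels).loss
loss.backward()
optimizer.step()
```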