LIUM ASR systems for the 2016 Multi-Genre Broadcast Arabic challenge
N. Tomashenko, Kevin Vythelingum, Anthony Rousseau, Y. Estève
2016 IEEE Spoken Language Technology Workshop (SLT), published 2016-12-01
DOI: 10.1109/SLT.2016.7846278 (https://doi.org/10.1109/SLT.2016.7846278)
Citations: 17
Abstract
This paper describes the automatic speech recognition (ASR) systems developed by LIUM for the 2016 Multi-Genre Broadcast (MGB-2) Challenge in the Arabic language. LIUM participated in the first of the two proposed tasks, namely the speech-to-text transcription of Aljazeera recordings. We present the approaches used in our systems and their details, as well as our results in the evaluation campaign, in which the primary LIUM ASR system ranked second. The main contributions are the use of GMM-derived features for training a DNN, combined with time-delay neural networks for acoustic modeling; the use of two different approaches to automatically phonetize Arabic words; and the training data selection strategy for acoustic and language models.