R. Vuddagiri, K. Gurugubelli, P. Jain, Hari Krishna Vydana, A. Vuppala
IIITH-ILSC Speech Database for Indian Language Identification
Workshop on Spoken Language Technologies for Under-resourced Languages, 2018-08-29
DOI: 10.21437/SLTU.2018-12 (https://doi.org/10.21437/SLTU.2018-12)
Citations: 17
Abstract
This work describes the development of a speech corpus covering 23 Indian languages for building language identification (LID) systems. Large amounts of data are a prerequisite for state-of-the-art LID systems, which motivated the creation of this multilingual speech corpus for Indian languages. The paper describes the composition of the data and the performance of several LID systems built on it. Mel-frequency cepstral features are used as the input representation. LID systems are developed using i-vectors, a deep neural network (DNN), and a deep neural network with attention (DNN-WA). The equal error rates (EER) of the i-vector, DNN, and DNN-WA systems are 17.77%, 17.95%, and 15.18%, respectively. The DNN-WA model outperforms both the i-vector and DNN models.
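The abstract reports results as equal error rates but does not say how they were computed. As a rough illustration only, the sketch below shows one common way to approximate an EER from detection scores: sweep a decision threshold and take the point where the false-acceptance and false-rejection rates are closest. The function name, the score lists, and the choice to use observed scores as candidate thresholds are all assumptions made for this example, not details taken from the paper.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by trying every observed score as a threshold.

    target_scores: scores for trials where the hypothesised language is correct.
    nontarget_scores: scores for trials where it is wrong.
    Both names are hypothetical; a trial is accepted when score >= threshold.
    """
    if not target_scores or not nontarget_scores:
        raise ValueError("both score lists must be non-empty")

    best_gap, eer = None, None
    for threshold in sorted(set(target_scores) | set(nontarget_scores)):
        # False rejection: a correct-language trial scored below the threshold.
        frr = sum(s < threshold for s in target_scores) / len(target_scores)
        # False acceptance: a wrong-language trial scored at or above it.
        far = sum(s >= threshold for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        # Keep the operating point where the two error rates are closest.
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


if __name__ == "__main__":
    # Toy scores, invented purely for illustration.
    targets = [0.9, 0.8, 0.7, 0.4]
    nontargets = [0.6, 0.3, 0.2, 0.1]
    print(f"EER = {equal_error_rate(targets, nontargets):.2%}")
```

For a multi-class task such as 23-language LID, one typical approach is to pool per-language detection trials (each utterance scored against each language) before applying a function like this; whether the authors did so is not stated in the abstract.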