Sudhamay Maity, Anil Kumar Vuppala, K. S. Sekhara Rao, Dipanjan Nandi
{"title":"IITKGP-MLILSC语言识别语音数据库","authors":"Sudhamay Maity, Anil Kumar Vuppala, K. S. Sekhara Rao, Dipanjan Nandi","doi":"10.1109/NCC.2012.6176831","DOIUrl":null,"url":null,"abstract":"In this paper, we are introducing speech database consists of 27 Indian languages for analyzing language specific information present in speech. In the context of Indian languages, systematic analysis of various speech features and classification models in view of automatic language identification has not performed, because of the lack of proper speech corpus covering majority of the Indian languages. With this motivation, we have initiated the task of developing multilingual speech corpus in Indian languages. In this paper spectral features are explored for investigating the presence of language specific information. Melfrequency cepstral coefficients (MFCCs) and linear predictive cepstral coefficients (LPCCs) are used for representing the spectral information. Gaussian mixture models (GMMs) are developed to capture the language specific information present in spectral features. The performance of language identification system is analyzed in view of speaker dependent and independent cases. The recognition performance is observed to be 96% and 45% respectively, for speaker dependent and independent environments.","PeriodicalId":178278,"journal":{"name":"2012 National Conference on Communications (NCC)","volume":"291 8","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2012-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"58","resultStr":"{\"title\":\"IITKGP-MLILSC speech database for language identification\",\"authors\":\"Sudhamay Maity, Anil Kumar Vuppala, K. S. Sekhara Rao, Dipanjan Nandi\",\"doi\":\"10.1109/NCC.2012.6176831\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this paper, we are introducing speech database consists of 27 Indian languages for analyzing language specific information present in speech. In the context of Indian languages, systematic analysis of various speech features and classification models in view of automatic language identification has not performed, because of the lack of proper speech corpus covering majority of the Indian languages. With this motivation, we have initiated the task of developing multilingual speech corpus in Indian languages. In this paper spectral features are explored for investigating the presence of language specific information. Melfrequency cepstral coefficients (MFCCs) and linear predictive cepstral coefficients (LPCCs) are used for representing the spectral information. Gaussian mixture models (GMMs) are developed to capture the language specific information present in spectral features. The performance of language identification system is analyzed in view of speaker dependent and independent cases. The recognition performance is observed to be 96% and 45% respectively, for speaker dependent and independent environments.\",\"PeriodicalId\":178278,\"journal\":{\"name\":\"2012 National Conference on Communications (NCC)\",\"volume\":\"291 8\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2012-04-03\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"58\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2012 National Conference on Communications (NCC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/NCC.2012.6176831\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2012 National Conference on Communications (NCC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/NCC.2012.6176831","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
IITKGP-MLILSC speech database for language identification
In this paper, we are introducing speech database consists of 27 Indian languages for analyzing language specific information present in speech. In the context of Indian languages, systematic analysis of various speech features and classification models in view of automatic language identification has not performed, because of the lack of proper speech corpus covering majority of the Indian languages. With this motivation, we have initiated the task of developing multilingual speech corpus in Indian languages. In this paper spectral features are explored for investigating the presence of language specific information. Melfrequency cepstral coefficients (MFCCs) and linear predictive cepstral coefficients (LPCCs) are used for representing the spectral information. Gaussian mixture models (GMMs) are developed to capture the language specific information present in spectral features. The performance of language identification system is analyzed in view of speaker dependent and independent cases. The recognition performance is observed to be 96% and 45% respectively, for speaker dependent and independent environments.