IITKGP-MLILSC语言识别语音数据库

Sudhamay Maity, Anil Kumar Vuppala, K. S. Sekhara Rao, Dipanjan Nandi
{"title":"IITKGP-MLILSC语言识别语音数据库","authors":"Sudhamay Maity, Anil Kumar Vuppala, K. S. Sekhara Rao, Dipanjan Nandi","doi":"10.1109/NCC.2012.6176831","DOIUrl":null,"url":null,"abstract":"In this paper, we are introducing speech database consists of 27 Indian languages for analyzing language specific information present in speech. In the context of Indian languages, systematic analysis of various speech features and classification models in view of automatic language identification has not performed, because of the lack of proper speech corpus covering majority of the Indian languages. With this motivation, we have initiated the task of developing multilingual speech corpus in Indian languages. In this paper spectral features are explored for investigating the presence of language specific information. Melfrequency cepstral coefficients (MFCCs) and linear predictive cepstral coefficients (LPCCs) are used for representing the spectral information. Gaussian mixture models (GMMs) are developed to capture the language specific information present in spectral features. The performance of language identification system is analyzed in view of speaker dependent and independent cases. The recognition performance is observed to be 96% and 45% respectively, for speaker dependent and independent environments.","PeriodicalId":178278,"journal":{"name":"2012 National Conference on Communications (NCC)","volume":"291 8","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2012-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"58","resultStr":"{\"title\":\"IITKGP-MLILSC speech database for language identification\",\"authors\":\"Sudhamay Maity, Anil Kumar Vuppala, K. S. Sekhara Rao, Dipanjan Nandi\",\"doi\":\"10.1109/NCC.2012.6176831\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this paper, we are introducing speech database consists of 27 Indian languages for analyzing language specific information present in speech. In the context of Indian languages, systematic analysis of various speech features and classification models in view of automatic language identification has not performed, because of the lack of proper speech corpus covering majority of the Indian languages. With this motivation, we have initiated the task of developing multilingual speech corpus in Indian languages. In this paper spectral features are explored for investigating the presence of language specific information. Melfrequency cepstral coefficients (MFCCs) and linear predictive cepstral coefficients (LPCCs) are used for representing the spectral information. Gaussian mixture models (GMMs) are developed to capture the language specific information present in spectral features. The performance of language identification system is analyzed in view of speaker dependent and independent cases. The recognition performance is observed to be 96% and 45% respectively, for speaker dependent and independent environments.\",\"PeriodicalId\":178278,\"journal\":{\"name\":\"2012 National Conference on Communications (NCC)\",\"volume\":\"291 8\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2012-04-03\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"58\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2012 National Conference on Communications (NCC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/NCC.2012.6176831\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2012 National Conference on Communications (NCC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/NCC.2012.6176831","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 58

摘要

本文介绍了由27种印度语言组成的语音数据库,用于分析语音中存在的语言特定信息。在印度语言背景下,由于缺乏覆盖大多数印度语言的合适的语音语料库,尚未对语言自动识别的各种语音特征和分类模型进行系统的分析。有了这个动机,我们开始了在印度语言开发多语言语料库的任务。本文利用谱特征来研究语言特异信息的存在。用混合频率倒谱系数(MFCCs)和线性预测倒谱系数(LPCCs)表示频谱信息。高斯混合模型(GMMs)的发展是为了捕捉语言特定的信息存在于频谱特征。从说话人依赖和独立两种情况分析了语言识别系统的性能。在说话人依赖和独立环境下,识别性能分别为96%和45%。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
IITKGP-MLILSC speech database for language identification
In this paper, we are introducing speech database consists of 27 Indian languages for analyzing language specific information present in speech. In the context of Indian languages, systematic analysis of various speech features and classification models in view of automatic language identification has not performed, because of the lack of proper speech corpus covering majority of the Indian languages. With this motivation, we have initiated the task of developing multilingual speech corpus in Indian languages. In this paper spectral features are explored for investigating the presence of language specific information. Melfrequency cepstral coefficients (MFCCs) and linear predictive cepstral coefficients (LPCCs) are used for representing the spectral information. Gaussian mixture models (GMMs) are developed to capture the language specific information present in spectral features. The performance of language identification system is analyzed in view of speaker dependent and independent cases. The recognition performance is observed to be 96% and 45% respectively, for speaker dependent and independent environments.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信