Vocal tract length normalization for vowel recognition in low resource languages

2014 International Conference on Asian Language Processing (IALP) | Pub Date: 2014-10-01 | DOI: 10.1109/IALP.2014.6973516

Shubham Sharma, Maulik C. Madhavi, H. Patil


Citations: 1

Abstract

Vocal Tract Length Normalization (VTLN) is used to design Automatic Speech Recognition (ASR) systems that are normalized for vocal tract length. It improves ASR performance by accounting for physiological differences among speakers. A number of speech recognition applications are currently being developed for Indian languages. In this paper, we use a state-of-the-art VTLN method based on the maximum likelihood approach. A vowel recognition system has been developed for two low-resource Indian languages, namely Gujarati and Marathi. Appropriate warping factors were obtained for all speakers used in training and testing. Vowel recognition performance improves compared with state-of-the-art Mel Frequency Cepstral Coefficients (MFCC).
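The abstract does not state which warping function, search grid, or acoustic model the authors used, so the following is only a minimal sketch of maximum-likelihood warping factor selection under stated assumptions. It assumes a piecewise-linear frequency warp, a grid of factors from 0.88 to 1.12, and a single diagonal Gaussian as the reference model. All function names and parameters here (`warp_frequencies`, `estimate_warp_factor`, `f_break_ratio`, and so on) are hypothetical and chosen for illustration.

```python
import numpy as np


def warp_frequencies(freqs, alpha, f_max, f_break_ratio=0.85):
    """Piecewise-linear warp (an assumed form, not taken from the paper).

    Frequencies below the break point are scaled by alpha; above it, a
    second linear segment is used so that f_max still maps to f_max.
    """
    freqs = np.asarray(freqs, dtype=float)
    f_break = f_break_ratio * f_max * min(1.0, 1.0 / alpha)
    slope = (f_max - alpha * f_break) / (f_max - f_break)
    low = alpha * freqs
    high = alpha * f_break + slope * (freqs - f_break)
    return np.where(freqs <= f_break, low, high)


def warp_spectrum(spectrum, alpha, f_max):
    """Resample one magnitude spectrum along the warped frequency axis."""
    freqs = np.linspace(0.0, f_max, len(spectrum))
    return np.interp(warp_frequencies(freqs, alpha, f_max), freqs, spectrum)


def gaussian_loglik(frames, mean, var):
    """Total log-likelihood of the frames under a diagonal Gaussian."""
    diff = frames - mean
    return float(-0.5 * np.sum(np.log(2.0 * np.pi * var) + diff ** 2 / var))


def estimate_warp_factor(frames, mean, var, f_max, alphas=None):
    """Grid search: return the warping factor with the highest likelihood."""
    if alphas is None:
        # Assumed grid: 13 factors from 0.88 to 1.12 in steps of 0.02.
        alphas = np.round(np.linspace(0.88, 1.12, 13), 2)
    scores = []
    for alpha in alphas:
        warped = np.array([warp_spectrum(f, alpha, f_max) for f in frames])
        scores.append(gaussian_loglik(warped, mean, var))
    return float(alphas[int(np.argmax(scores))])
```

Calling `estimate_warp_factor(frames, mean, var, f_max=8000.0)` with a speaker's spectral frames and the reference model's mean and variance returns the grid value that best aligns that speaker with the model. A full system would typically score MFCC features against trained acoustic models rather than raw spectra against a single Gaussian.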
