Vocal tract length normalization for vowel recognition in low resource languages

2014 International Conference on Asian Language Processing (IALP). Pub Date: 2014-10-01. DOI: 10.1109/IALP.2014.6973516

Shubham Sharma, Maulik C. Madhavi, H. Patil

Citations: 1

Abstract

Vocal Tract Length Normalization (VTLN) is used to build speaker-normalized Automatic Speech Recognition (ASR) systems. It improves ASR performance by accounting for physiological differences among speakers. A number of speech recognition applications are currently being developed for Indian languages. In this paper, we use a state-of-the-art VTLN method based on the maximum likelihood approach. A vowel recognition system has been developed for two low-resource Indian languages, viz., Gujarati and Marathi. Appropriate warping factors have been obtained for all speakers used in training and testing. Vowel recognition performance improves compared with state-of-the-art Mel Frequency Cepstral Coefficients (MFCC) features.
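
The maximum likelihood selection of a warping factor mentioned in the abstract can be illustrated with a small grid search. This is only a sketch under stated assumptions, not the authors' implementation: the abstract does not give the warping function, the feature pipeline, or the acoustic model, so the piecewise-linear warp, the 0.7 cutoff ratio, the 0.80 to 1.20 search grid, the single diagonal Gaussian standing in for the acoustic model, and the synthetic two-formant spectrum are all hypothetical choices made for illustration. A real system would typically warp the Mel filterbank and score MFCC frames against trained models.

```python
import numpy as np


def piecewise_linear_warp(freqs, alpha, cutoff_ratio=0.7):
    """Map f to alpha * f below a cutoff, then bend the line so that the
    band edge f_max maps to itself (assumed warp shape, not from the paper)."""
    f_max = freqs[-1]
    f_cut = cutoff_ratio * f_max
    slope = (f_max - alpha * f_cut) / (f_max - f_cut)
    upper = alpha * f_cut + slope * (freqs - f_cut)
    return np.where(freqs <= f_cut, alpha * freqs, upper)


def warp_spectrum(spectrum, freqs, alpha):
    """Resample so that output bin f holds the input value at warp(f)."""
    return np.interp(piecewise_linear_warp(freqs, alpha), freqs, spectrum)


def gaussian_loglik(x, mean, var):
    """Log-likelihood of x under a diagonal Gaussian (stand-in acoustic model)."""
    return -0.5 * np.sum((x - mean) ** 2 / var + np.log(2.0 * np.pi * var))


def estimate_warp_factor(spectrum, freqs, mean, var, alphas):
    """Grid search: return the alpha whose warped spectrum scores highest."""
    scores = [gaussian_loglik(warp_spectrum(spectrum, freqs, a), mean, var)
              for a in alphas]
    return float(alphas[int(np.argmax(scores))])


def toy_spectrum(freqs):
    """Two synthetic formant-like bumps at 700 Hz and 1200 Hz (made-up data)."""
    return (np.exp(-0.5 * ((freqs - 700.0) / 100.0) ** 2)
            + np.exp(-0.5 * ((freqs - 1200.0) / 100.0) ** 2))


freqs = np.linspace(0.0, 4000.0, 513)
reference = toy_spectrum(freqs)        # plays the role of the model mean
speaker = toy_spectrum(freqs / 1.1)    # same shape, formants 10% higher
alphas = np.linspace(0.80, 1.20, 21)   # candidate warping factors, step 0.02
best = estimate_warp_factor(speaker, freqs, reference,
                            np.ones_like(freqs), alphas)
print(f"estimated warping factor: {best:.2f}")
```

On this toy data the search recovers the factor of 1.10 that was used to shift the synthetic formants. Note that the direction of the warp (whether a factor above 1 compresses or stretches the frequency axis) is a convention that differs between toolkits, so the sign of the effect here should not be read as the paper's convention.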



