A hierarchical structure for modeling inter and intra phonetic information for phoneme recognition

2009 IEEE Workshop on Automatic Speech Recognition & Understanding. Pub Date: 2009-12-01. DOI: 10.1109/ASRU.2009.5373272

D. Vásquez, Guillermo Aradilla, R. Gruhn, W. Minker

Citations: 2

Abstract

In this paper, we present a two-layer hierarchical structure based on neural networks for phoneme recognition. The proposed structure attempts to model only the characteristics within a phoneme, i.e., intra-phonetic information. This differs from other state-of-the-art hierarchical structures, in which the first layer typically models the intra-phonetic information while the second layer focuses on modeling the contextual (inter-phonetic) information. An advantage of the proposed model is that it can be combined with another layer that focuses on the inter-phonetic information. We also show that distinguishing between intra- and inter-phonetic information makes it possible to extend other state-of-the-art hierarchical approaches. A phoneme accuracy of 77.89% is achieved on the TIMIT database, which compares favorably to the best results reported on this database.
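The abstract does not give architectural details, so the following is only a minimal sketch of the general kind of hierarchy it refers to: a first neural network estimates per-frame phoneme posteriors, and a second network re-estimates them from a window of the first network's outputs. All sizes (39 input features, 39 phoneme classes, 64 hidden units, a window of 5 frames), the random untrained weights, and the helper names `mlp_posteriors`, `stack_context`, and `init_params` are illustrative assumptions, not the authors' configuration. How the window is chosen to capture intra- versus inter-phonetic information is likewise not specified here.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def mlp_posteriors(x, w1, b1, w2, b2):
    # One-hidden-layer MLP that maps each input row to class posteriors.
    h = np.tanh(x @ w1 + b1)
    return softmax(h @ w2 + b2)

def stack_context(frames, half_window):
    # Concatenate each frame with its neighbours; edges are padded by repetition.
    padded = np.pad(frames, ((half_window, half_window), (0, 0)), mode="edge")
    n = frames.shape[0]
    return np.hstack([padded[i:i + n] for i in range(2 * half_window + 1)])

def init_params(rng, n_in, n_hidden, n_out):
    # Small random weights; a real system would train these (assumption: untrained).
    return (rng.normal(0.0, 0.1, (n_in, n_hidden)), np.zeros(n_hidden),
            rng.normal(0.0, 0.1, (n_hidden, n_out)), np.zeros(n_out))

rng = np.random.default_rng(0)
n_frames, n_feats, n_phones, half_window = 50, 39, 39, 2  # illustrative sizes

# Stand-in for acoustic feature vectors (random data, not real speech).
features = rng.normal(size=(n_frames, n_feats))

# First layer of the hierarchy: features -> phoneme posteriors per frame.
layer1 = init_params(rng, n_feats, 64, n_phones)
post1 = mlp_posteriors(features, *layer1)

# Second layer: a window of first-layer posteriors -> refined posteriors.
stacked = stack_context(post1, half_window)
layer2 = init_params(rng, stacked.shape[1], 64, n_phones)
post2 = mlp_posteriors(stacked, *layer2)
```

In this sketch the second network sees only posteriors, never the raw features, which is what makes it possible to stack a further layer on top in the same way.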

