Speech recognition using time-warping neural networks

Neural Networks for Signal Processing: Proceedings of the 1991 IEEE Workshop. Pub Date: 1991-09-30. DOI: 10.1109/NNSP.1991.239508

K. Aikawa

Citations: 6

Abstract

The author proposes a time-warping neural network (TWNN) for phoneme-based speech recognition. The TWNN is designed to accept phonemes of arbitrary duration, whereas conventional phoneme recognition networks have a fixed-length input window. The purpose of the network is to cope not only with variability in phoneme duration but also with time warping within a phoneme. The proposed network is composed of several time-warping units, each of which has its own time-warping function. The TWNN is characterized by these time-warping functions being embedded between the input layer and the first hidden layer of the network. The proposed network demonstrates higher phoneme recognition accuracy than a baseline recognizer based on conventional feedforward neural networks and linear time alignment. Its recognition accuracy is even higher than that achieved with discrete hidden Markov models.
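The abstract does not give the form of the time-warping functions, so the following is only a minimal sketch of the general idea, not the author's method. It assumes each time-warping unit resamples a variable-length phoneme segment onto a fixed number of frames through a monotonic warping function, and that the outputs of all units are concatenated as the fixed-size input to the first hidden layer. The names `resample`, `linear_align`, `power_warp`, and `time_warping_layer`, and the power-law warps themselves, are hypothetical choices made for illustration.

```python
from typing import Callable, List, Sequence

Frame = Sequence[float]          # one feature vector (e.g. a spectral frame)
Warp = Callable[[float], float]  # monotonic map from [0, 1] to [0, 1]


def resample(frames: Sequence[Frame], positions: Sequence[float]) -> List[List[float]]:
    """Linearly interpolate `frames` at normalized time positions in [0, 1]."""
    last = len(frames) - 1
    out = []
    for u in positions:
        t = u * last                 # fractional frame index
        lo = min(int(t), last)       # frame at or before t
        hi = min(lo + 1, last)       # next frame, clamped at the end
        frac = t - lo
        out.append([(1.0 - frac) * a + frac * b
                    for a, b in zip(frames[lo], frames[hi])])
    return out


def unit_grid(n_out: int) -> List[float]:
    """Evenly spaced positions from 0 to 1 (assumes n_out >= 2)."""
    return [i / (n_out - 1) for i in range(n_out)]


def linear_align(frames: Sequence[Frame], n_out: int) -> List[List[float]]:
    """Baseline: linear time alignment onto a fixed-length window."""
    return resample(frames, unit_grid(n_out))


def power_warp(gamma: float) -> Warp:
    """Assumed example warp: u -> u**gamma (gamma = 1 is the linear case)."""
    return lambda u: u ** gamma


def time_warping_layer(frames: Sequence[Frame], n_out: int,
                       warps: Sequence[Warp]) -> List[float]:
    """Apply every warping unit and flatten the results into one input vector."""
    grid = unit_grid(n_out)
    flat: List[float] = []
    for warp in warps:
        for frame in resample(frames, [warp(u) for u in grid]):
            flat.extend(frame)
    return flat
```

Under these assumptions, segments of any duration map to a vector of length `len(warps) * n_out * D`, where `D` is the feature dimension, so a conventional feedforward network can follow the layer unchanged. In a trainable variant the warp parameters would presumably be learned rather than fixed, but the abstract does not say how.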
