Speaker-independent voiced-stop-consonant recognition using a block-windowed neural network architecture

1993 (25th) Southeastern Symposium on System Theory
Pub Date: 1993-03-07
DOI: 10.1109/SSST.1993.522811

B.D. Bryant, J. Gowdy

Citations: 1

Abstract

The authors study several of the better-known connectionist models and how they address the time and frequency variability of the multispeaker voiced-stop-consonant recognition task. The network architectures reviewed or tested include the self-organizing feature map (SOFM) architecture and various derivatives of it, the time-delay neural network (TDNN) architecture and various derivatives of it, and two frequency-and-time-shift-invariant architectures: the frequency-shift-invariant TDNN (FTDNN) and the block-windowed neural network (BWNN). Voiced-stop speech was extracted from up to four dialect regions of the TIMIT continuous speech corpus for preprocessing and for training and testing network instances. Various feature representations were tested for their robustness in representing the voiced-stop consonants.


