A phoneme recognition system using modular construction of time-delay neural networks

Thomas C. Wendell, K. Abdelhamied
Published in: [1992] Proceedings Fifth Annual IEEE Symposium on Computer-Based Medical Systems, 1992-06-14
DOI: https://doi.org/10.1109/CBMS.1992.245041
Citations: 2

Abstract

This paper describes research on alternative ways of representing phoneme data for input to an artificial neural network, and on changes to the network that reduce training time without sacrificing recognition rate. The study used a modularly constructed time-delay neural network (TDNN) trained to identify English stop consonant phonemes under speaker-independent, continuous-speech conditions. Samples of continuous speech were recorded from ten male speakers, and phonemes were extracted manually. The extracted phonemes were smaller than the maximum input size of the TDNN, so the data were shifted within the input window to allow a phoneme to be recognized regardless of where it appeared in the window. The TDNN contained 490 processing elements and over 12,000 connections, and its modular construction allows future expansion. Recognition rates as high as 98.8% were obtained for individual phonemes within modules, and overall recognition rates as high as 79.4% were obtained across all stop consonant phonemes.
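The idea of shifting a short phoneme segment within a fixed input window can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the window length, the feature representation, the padding value, or whether every possible offset was used, so `shifted_placements`, `window_len`, `pad_value`, and the toy data below are all hypothetical choices made for illustration.

```python
def shifted_placements(frames, window_len, pad_value=0.0):
    """Return every placement of `frames` inside a window of `window_len` frames.

    `frames` is a list of per-frame feature vectors (lists of floats).
    Frames of the window not covered by the segment are filled with
    `pad_value` (an assumption; the abstract does not say how they were filled).
    """
    if not frames:
        raise ValueError("frames must not be empty")
    n_frames = len(frames)
    if n_frames > window_len:
        raise ValueError("segment is longer than the input window")
    n_features = len(frames[0])
    placements = []
    # One training example per possible starting offset of the segment.
    for offset in range(window_len - n_frames + 1):
        window = [[pad_value] * n_features for _ in range(window_len)]
        for i, frame in enumerate(frames):
            window[offset + i] = list(frame)
        placements.append(window)
    return placements


# Hypothetical 3-frame segment with 2 features per frame, in a 5-frame window.
segment = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
examples = shifted_placements(segment, window_len=5)
print(len(examples))  # 3 placements: offsets 0, 1 and 2
```

Each placement would share the same phoneme label, so a network trained on all of them sees the phoneme at every position in the window.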