A phoneme recognition system using modular construction of time-delay neural networks
Authors: Thomas C. Wendell, K. Abdelhamied
Published in: [1992] Proceedings Fifth Annual IEEE Symposium on Computer-Based Medical Systems, 1992-06-14
DOI: https://doi.org/10.1109/CBMS.1992.245041
Citations: 2
Abstract
This paper describes research on alternative approaches to representing phoneme data for input to an artificial neural network, and on alterations to the network that reduce training time without sacrificing recognition rate. A modularly constructed time-delay neural network (TDNN) was trained to identify English stop consonant phonemes under speaker-independent, continuous speech conditions. Samples of continuous speech were recorded from ten male speakers, and the phonemes were extracted manually. The extracted phonemes were shorter than the maximum input size of the TDNN, so the data were shifted within the input window to allow a phoneme to be recognized regardless of where it was presented within that window. The TDNN contained 490 processing elements and over 12,000 connections and was constructed in a modular fashion, allowing future expansion. Recognition rates as high as 98.8% were obtained for individual phonemes within modules, and overall recognition rates as high as 79.4% were obtained for all stop consonant phonemes.
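The shifting of a short phoneme segment within a longer input window can be illustrated with a minimal sketch. The abstract does not say how the unused frames were filled, how many frames the window held, or what features each frame carried, so the zero padding, the window length, the feature count, and the function name `shifted_windows` below are all assumptions made for illustration, not the authors' procedure.

```python
from typing import List, Sequence

Frame = List[float]


def shifted_windows(
    segment: Sequence[Sequence[float]],
    window_len: int,
    pad_value: float = 0.0,
) -> List[List[Frame]]:
    """Return one padded copy of `segment` per possible offset in the window.

    Hypothetical helper: zero padding and the frame layout are assumptions,
    not details taken from the paper.
    """
    if not segment:
        raise ValueError("segment must contain at least one frame")
    if len(segment) > window_len:
        raise ValueError("segment is longer than the input window")

    n_features = len(segment[0])
    windows = []
    for offset in range(window_len - len(segment) + 1):
        # Start from an all-padding window, then copy the segment in at `offset`.
        window = [[pad_value] * n_features for _ in range(window_len)]
        for i, frame in enumerate(segment):
            window[offset + i] = list(frame)
        windows.append(window)
    return windows


# Toy example: a 3-frame segment with 2 features per frame in a 5-frame window.
segment = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
windows = shifted_windows(segment, window_len=5)
print(len(windows))  # one window per offset: 0, 1, 2
```

Each returned window holds the same segment at a different position, so a network trained on all of them sees the phoneme at every location it could occupy in the input window.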