Robust Syllable Segmentation Of Continuous Speech Using Neural Networks
A. Noetzel
Electro International, 1991. Published 1991-04-16.
DOI: 10.1109/ELECTR.1991.718279 (https://doi.org/10.1109/ELECTR.1991.718279)
Citations: 10
Abstract
We describe a multilayered neural network structure for continuous speech recognition, based on the isolation and identification of syllables. The first layer is a neural network, trained by unsupervised learning, that detects syllable boundaries and provides a representation of the phonetic content of each syllable. The next layer provides a phonemic representation of the syllable. Each cell of the third layer represents a particular syllable. Multiple cell activations at this layer represent the syllables of an utterance: a phrase or a multisyllabic word. The temporal-discriminant cell, whose activation depends on the sequence of activations at its inputs, is used to disambiguate the pattern in the syllable-cell layer. Each cell of the fourth layer represents a particular word or phrase. Because a syllable cannot be precisely defined in phonetic terms, and because of the variations of articulation and the boundary effects of adjoining words, different syllables will be identified in different utterances of a word. The neural network structure presented here has a procedure for incorporating alternate representations of words, based on the variations of syllabification that occur in connected speech. The procedure is activated by the misrecognition of a particular word or phrase during supervised learning. A broad class of alternate syllabifications, including the migration of a consonant from syllable-final to syllable-initial position (of the following syllable), is encompassed by a single training step. The learning procedure is demonstrated through simple examples.
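The abstract does not give the cell equations, so the following is only a toy sketch of two ideas it mentions: a word-level cell whose response depends on the order of syllable activations, and a single training step that covers a family of alternate syllabifications produced by consonant migration. Everything here is an assumption made for illustration: the names `VOWELS`, `alternate_syllabifications`, and `WordCell`, the use of letter strings in place of phonemic representations, and the exact-match activation rule. It is not the network described in the paper.

```python
VOWELS = set("aeiou")  # toy stand-in for a phoneme inventory (assumption)


def alternate_syllabifications(syllables):
    """Return all variants obtained by optionally moving a syllable-final
    consonant to the start of the following syllable.

    Syllables are assumed to be non-empty strings.
    """
    variants = {tuple(syllables)}
    for i in range(len(syllables) - 1):
        new_variants = set()
        for v in variants:
            left, right = v[i], v[i + 1]
            # Migrate only if the left syllable ends in a consonant
            # and still contains a vowel after losing it.
            if left[-1] not in VOWELS and any(c in VOWELS for c in left[:-1]):
                moved = v[:i] + (left[:-1], left[-1] + right) + v[i + 2:]
                new_variants.add(moved)
        variants |= new_variants
    return variants


class WordCell:
    """Hypothetical order-sensitive cell: responds only to stored sequences."""

    def __init__(self, label):
        self.label = label
        self.sequences = set()

    def learn(self, syllables):
        # One training step stores the whole family of migrated variants.
        self.sequences |= alternate_syllabifications(syllables)

    def activation(self, observed):
        # The same syllables in a different order do not activate the cell.
        return 1.0 if tuple(observed) in self.sequences else 0.0


take_it = WordCell("take it")
take_it.learn(["tak", "it"])
print(take_it.activation(["ta", "kit"]))  # resyllabified form
print(take_it.activation(["it", "tak"]))  # same syllables, wrong order
```

In this sketch the resyllabified input is accepted after one call to `learn`, while the reordered input is rejected. A real temporal-discriminant cell would presumably produce graded activations rather than an exact-match lookup.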