Title: Integrating dynamic speech modalities into context decision trees
Authors: C. Fügen, I. Rogina
Published in: 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2000), June 5, 2000
DOI: 10.1109/ICASSP.2000.861810
Citations: 22
Abstract
Context decision trees are widely used in the speech recognition community. Besides questions about the phonetic classes of a phone's context, questions about a phone's position within a word and about the gender of the current speaker have been used so far. In this paper we additionally incorporate questions about current modalities of the spoken utterance, such as the speaker's dialect, the speaking rate, and the signal-to-noise ratio, the latter two of which may change within a single utterance. We present a framework that treats all of these modalities in a uniform way. Experiments with the Janus speech recognizer have produced error rate reductions of up to 10% compared to systems that do not use modality questions.
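The uniform treatment described in the abstract can be illustrated with a small sketch (not the authors' implementation; all names, thresholds, and model labels here are hypothetical): every node of the context decision tree asks a yes/no question over the same context record, whether the question concerns phonetic context, within-word position, or a dynamic modality such as speaking rate or SNR, and each leaf selects a tied acoustic model.

```python
# Illustrative sketch of a context decision tree in which phonetic-context
# questions and modality questions (dialect, speaking rate, SNR) are treated
# uniformly as predicates over one context dictionary. Hypothetical example,
# not the Janus implementation.

VOWELS = {"a", "e", "i", "o", "u"}

class Node:
    """Internal tree node: a yes/no question with two subtrees (or leaves)."""
    def __init__(self, question, desc, yes, no):
        self.question = question  # predicate over the context dict
        self.desc = desc          # human-readable question text
        self.yes = yes            # subtree/leaf if the answer is yes
        self.no = no              # subtree/leaf if the answer is no

def classify(node, ctx):
    """Walk the tree; leaves are (hypothetical) tied acoustic-model labels."""
    while isinstance(node, Node):
        node = node.yes if node.question(ctx) else node.no
    return node

# Uniform treatment: the phonetic question and the modality questions are
# all just predicates over the same context record.
tree = Node(lambda c: c["left_phone"] in VOWELS, "is the left phone a vowel?",
            yes=Node(lambda c: c["snr_db"] < 10.0, "is the SNR low?",
                     yes="model_A_noisy", no="model_A_clean"),
            no=Node(lambda c: c["speaking_rate"] > 1.2, "is the speech fast?",
                    yes="model_B_fast", no="model_B_normal"))

ctx = {"left_phone": "a", "snr_db": 25.0, "speaking_rate": 1.0}
print(classify(tree, ctx))  # -> model_A_clean
```

Because the speaking rate and SNR can change within a single utterance, a recognizer using such a tree would re-evaluate the modality fields of the context record as the utterance progresses, rather than fixing them once per utterance.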