BYBLOS: The BBN continuous speech recognition system

ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing. Pub Date: 1987-04-06. DOI: 10.1109/ICASSP.1987.1169748

Y. Chow, M. O. Dunham, O. Kimball, M. Krasner, G. Kubala, J. Makhoul, P. Price, Salim Roukos, R. Schwartz


Citations: 176

Abstract

In this paper, we describe BYBLOS, the BBN continuous speech recognition system. The system, designed for large-vocabulary applications, integrates acoustic, phonetic, lexical, and linguistic knowledge sources to achieve high recognition performance. The basic approach, as described in previous papers [1, 2], makes extensive use of robust context-dependent models of phonetic coarticulation using Hidden Markov Models (HMMs). We describe the components of the BYBLOS system: the signal processing front end, dictionary, phonetic model training system, word model generator, grammar, and decoder. In recognition experiments, we demonstrate consistently high word recognition performance on continuous speech across speakers, task domains, and grammars of varying complexity. In speaker-dependent mode, where 15 minutes of speech is required to train the system for a speaker, 98.5% word accuracy has been achieved in continuous speech for a 350-word task, using grammars with perplexity ranging from 30 to 60. With only 15 seconds of training speech, we demonstrate performance of 97% using a grammar.
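
The abstract reports grammar perplexity without defining it. As a rough illustration only, the sketch below uses the standard definition of perplexity as 2 raised to the average negative log2 probability per word, so that a grammar allowing 30 equally likely next words at every point has perplexity 30. The function name `perplexity` and the example probabilities are assumptions made for illustration; they are not taken from the paper, and the paper's own grammars and measurement procedure may differ.

```python
import math


def perplexity(word_probs):
    """Perplexity of a word sequence.

    word_probs holds the probability the grammar assigns to each word
    given its preceding context (hypothetical input for this sketch).
    """
    if not word_probs:
        raise ValueError("need at least one word probability")
    # Average log2 probability per word, negated and exponentiated.
    log2_sum = sum(math.log2(p) for p in word_probs)
    return 2 ** (-log2_sum / len(word_probs))


if __name__ == "__main__":
    # Every word is one of 30 equally likely choices: perplexity is about 30.
    print(perplexity([1 / 30] * 8))
    # Mixed branching: the result is the geometric mean of the inverse probabilities.
    print(perplexity([0.5, 0.125]))
```

Under this definition, perplexity can be read as the effective number of word choices the recognizer faces at each point, which is why a lower-perplexity grammar generally makes the recognition task easier.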
