Efficient recognition of continuously-spoken numbers

Canadian Conference on Electrical and Computer Engineering 2001. Conference Proceedings (Cat. No.01TH8555). Pub Date: 2001-05-13. DOI: 10.1109/CCECE.2001.933735

D. O'Shaughnessy, M. Gabrea



Abstract

Automatic recognition of continuously-spoken numbers (e.g., telephone or credit card digit sequences) is possible with excellent accuracy, even in applications using telephone lines and serving a large population. However, even such simple recognition tasks suffer decreased performance in adverse conditions, e.g., significant background noise or fading on portable telephone channels. If we further impose significant limitations on the computing resources for the recognition task, then robust, efficient speech recognition is still a significant challenge, even for a vocabulary as simple as the digits. Since connected-digit recognition over telephone lines has very practical applications, the amount of computer resources needed for a given level of recognition accuracy is investigated. Rather than use a traditional hidden Markov model approach with cepstral analysis, which is computationally intensive and does not always work well under adverse acoustic conditions, a simpler spectral analysis is used, combined with a segmental approach. The restricted nature of the digit vocabulary allows this simpler approach. High recognition accuracy can be maintained despite a large decrease in both memory and computation.
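
The abstract does not specify the spectral analysis or the segmentation rule, so the following is only a generic sketch of what a low-cost "coarse spectrum plus segmentation" front end might look like, not the authors' algorithm. It sums naive DFT power into a few wide bands per frame and groups consecutive high-energy frames into segments. The function names, the frame length of 64 samples, the four bands, and the energy threshold are all illustrative assumptions.

```python
import cmath
import math


def band_energies(frame, n_bands=4):
    """Sum naive-DFT power into n_bands equal-width bands (assumed layout)."""
    n = len(frame)
    half = n // 2  # keep only the non-negative frequency bins
    power = []
    for k in range(half):
        s = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(frame))
        power.append(abs(s) ** 2 / n)
    width = half // n_bands
    return [sum(power[b * width:(b + 1) * width]) for b in range(n_bands)]


def segment(signal, frame_len=64, threshold=1.0):
    """Return (start_frame, end_frame_exclusive) runs of high-energy frames.

    The threshold is a hypothetical value chosen for this toy example.
    """
    segments = []
    start = None
    n_frames = len(signal) // frame_len
    for i in range(n_frames):
        frame = signal[i * frame_len:(i + 1) * frame_len]
        active = sum(band_energies(frame)) > threshold
        if active and start is None:
            start = i  # a new segment begins
        elif not active and start is not None:
            segments.append((start, i))  # the segment ended at frame i
            start = None
    if start is not None:
        segments.append((start, n_frames))
    return segments


# Toy input: two silent frames, three frames of a pure tone, two silent frames.
tone = [math.sin(2 * math.pi * 8 * t / 64) for t in range(192)]
signal = [0.0] * 128 + tone + [0.0] * 128
print(segment(signal))
```

A real recognizer would then classify each segment from its band-energy pattern. The point of the sketch is only that a handful of band energies per frame takes far less memory and arithmetic than a full cepstral feature vector scored against hidden Markov model states.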
