Exploiting complementary aspects of phonological features in automatic speech recognition

2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU) · Published: 2007-12-01 · DOI: 10.1109/ASRU.2007.4430082

P. MomayyezSiahkal, James Waterhouse, R. Rose


Citations: 5

Abstract

This paper presents techniques for exploiting complementary information contained in multiple definitions of phonological feature systems. Three feature systems are considered, differing in their structure and in the acoustic-phonetic features they represent. For each phonological feature system, a two-stage process is implemented: a mechanism for frame-level phonological feature detection, followed by a mechanism for decoding phoneme sequences from the detected features. Two methods are investigated for integrating these features with MFCC-based ASR systems. First, phonological-feature and MFCC-based systems are combined in a lattice re-scoring paradigm. Second, confusion-network-based system combination (CNC) is used to combine phone networks derived from phonological distinctive feature (PDF) and MFCC-based systems. Both methods are shown to reduce phone error rates by as much as 15% relative to the phone error rates obtained for any individual feature stream.
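To make the idea of confusion-network combination more concrete, here is a minimal Python sketch. It is not the authors' implementation: it assumes the phone confusion networks from the two systems are already aligned slot by slot (real CNC must perform that alignment), and the function names, the toy posteriors, and the equal system weights are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def combine_confusion_networks(networks, weights):
    """Combine pre-aligned confusion networks by weighted posterior averaging.

    Each network is a list of slots; each slot maps a phone label to its
    posterior probability. Slot alignment across systems is assumed to be
    given, which is a simplification of real system combination.
    """
    if len(networks) != len(weights):
        raise ValueError("need one weight per network")
    n_slots = len(networks[0])
    if any(len(net) != n_slots for net in networks):
        raise ValueError("networks must be pre-aligned to the same length")

    total = sum(weights)
    combined = []
    for i in range(n_slots):
        slot = defaultdict(float)
        for net, w in zip(networks, weights):
            # Accumulate each system's posterior, scaled by its normalized weight.
            for phone, posterior in net[i].items():
                slot[phone] += (w / total) * posterior
        combined.append(dict(slot))
    return combined


def best_path(network):
    """Pick the highest-posterior phone in every slot."""
    return [max(slot, key=slot.get) for slot in network]


# Toy posteriors (invented): the two systems disagree on the final phone.
mfcc_net = [{"k": 0.6, "g": 0.4}, {"ae": 0.9, "eh": 0.1}, {"t": 0.55, "d": 0.45}]
pdf_net = [{"k": 0.7, "g": 0.3}, {"ae": 0.6, "eh": 0.4}, {"t": 0.3, "d": 0.7}]

combined = combine_confusion_networks([mfcc_net, pdf_net], [0.5, 0.5])
print(best_path(combined))
```

In this toy case the MFCC-based network alone would pick "t" for the last slot, while the combined network picks "d" because the second system is more confident there. This is the sense in which one stream can correct another, though whether it helps in practice depends on how complementary the systems' errors are.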
