The IBM 2007 speech transcription system for European parliamentary speeches
B. Ramabhadran, O. Siohan, A. Sethy
2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU), 2007-12-01
DOI: 10.1109/ASRU.2007.4430158 (https://doi.org/10.1109/ASRU.2007.4430158)
Citations: 43
Abstract
TC-STAR is a European Union-funded speech-to-speech translation project to transcribe, translate, and synthesize European Parliamentary Plenary Speeches (EPPS). This paper describes IBM's English speech recognition system submitted to the TC-STAR 2007 Evaluation. Language model adaptation based on clustering and data selection using relative entropy minimization provided significant gains in the 2007 evaluation. The additional advances over the 2006 system presented in this paper are unsupervised training of acoustic and language models, and a system architecture based on cross-adaptation across complementary systems and on system combination through an ensemble of systems generated with randomized decision tree state-tying. These advances reduced the error rate by 30% relative to the best-performing system in the TC-STAR 2006 evaluation on the 2006 English development and evaluation test sets, and produced one of the best-performing systems in the 2007 English evaluation, with a word error rate of 7.1%.
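
The "30% relative" figure in the abstract is a relative error-rate reduction, measured as a fraction of the baseline error rate and not as a difference in percentage points. As a minimal sketch of that arithmetic, the snippet below uses a hypothetical helper `relative_reduction` and purely illustrative error rates; neither the function name nor the numbers come from the paper.

```python
def relative_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the error-rate reduction as a fraction of the baseline rate."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Illustrative values only (assumed, not reported in the paper):
# a drop from 20.0% to 14.0% is 6 points absolute, i.e. 30% relative.
baseline = 20.0
improved = 14.0
print(f"{relative_reduction(baseline, improved):.0%} relative reduction")
```

Reporting the relative figure makes improvements comparable across test sets whose baseline error rates differ.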