Language modeling and transcription of the TED corpus lectures

2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). Pub Date: 2003-04-06. DOI: 10.1109/ICASSP.2003.1198760

Erwin Leeuwis, Marcello Federico, M. Cettolo


Citations: 70

Abstract

Transcribing lectures is a challenging task, both for acoustic modeling and for language modeling. In this work, we present our first results on the automatic transcription of lectures from the TED corpus, recently released by ELRA and LDC. In particular, we concentrated our effort on language modeling. Baseline acoustic and language models were developed using, respectively, 8 hours of TED transcripts and various types of text: conference proceedings, lecture transcripts, and conversational speech transcripts. Adaptation of the language model to single speakers was then investigated by exploiting different kinds of information: automatic transcripts of the talk, the title of the talk, the abstract and, finally, the paper. In the last case, a word error rate (WER) of 39.2% was achieved.
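For readers unfamiliar with the metric reported above, WER is conventionally the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the recognizer output, divided by the number of reference words. The sketch below illustrates that standard definition only; the function name and the example sentences are hypothetical, and it is not the scoring tool used by the authors, which the abstract does not name.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative helper (hypothetical name), not the authors' scorer.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # aligning i reference words with an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Made-up sentences, chosen only to exercise the function.
    ref = "the talk is about language modeling"
    hyp = "the talk about language models"
    print(f"WER = {word_error_rate(ref, hyp):.3f}")
```

Because insertions are counted, a WER above 100% is possible when the hypothesis is much longer than the reference.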
