Improved Language Models for ASR using Written Language Text

Kaustuv Mukherji, Meghna Pandharipande, Sunil Kumar Kopparapu
2022 National Conference on Communications (NCC), published 2022-05-24
DOI: 10.1109/NCC55593.2022.9806803 (https://doi.org/10.1109/NCC55593.2022.9806803)
Citations: 1

Abstract

The performance of an Automatic Speech Recognition (ASR) engine depends primarily on (a) the acoustic model (AM), (b) the language model (LM), and (c) the lexicon (Lx). While the contribution of each block to the overall performance of an ASR cannot be measured separately, a good LM improves the performance of a domain-specific ASR at a smaller cost. Generally, building an LM is greener than building an AM and is much easier for a domain-specific ASR, because it requires only domain-specific text corpora. Traditionally, because of their ready availability, written language text (WLT) corpora have been used to build LMs, though it is agreed that there is a significant difference between WLT and spoken language text (SLT). In this paper, we explore methods and techniques that can be used to convert WLT into a form that realizes a better LM to support ASR performance.
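
The abstract does not say which conversions the authors apply, so the following is only a minimal sketch of what turning WLT into a more speech-like form could involve. It assumes three simple rules (expanding a few abbreviations, reading numerals digit by digit, and dropping punctuation) and then counts bigrams as a stand-in for LM training statistics. The names `ABBREVIATIONS`, `written_to_spoken`, and `bigram_counts` are hypothetical and do not come from the paper.

```python
import re
from collections import Counter

# Assumed, illustrative rules only; not the authors' method.
ABBREVIATIONS = {"dr.": "doctor", "mr.": "mister", "vs.": "versus"}
DIGIT_WORDS = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]


def spell_digits(match):
    """Read a digit string digit by digit, e.g. '25' -> 'two five'."""
    return " ".join(DIGIT_WORDS[int(d)] for d in match.group())


def written_to_spoken(text):
    """Map written-style text to a rough spoken-style token stream."""
    # Expand whole-token abbreviations after lowercasing.
    tokens = [ABBREVIATIONS.get(tok, tok) for tok in text.lower().split()]
    text = " ".join(tokens).replace("%", " percent")
    text = re.sub(r"\d+", spell_digits, text)   # numerals -> digit words
    text = re.sub(r"[^a-z' ]+", " ", text)      # drop remaining punctuation
    return " ".join(text.split())               # collapse whitespace


def bigram_counts(sentences):
    """Count bigrams with sentence-boundary markers <s> and </s>."""
    counts = Counter()
    for sentence in sentences:
        words = ["<s>"] + sentence.split() + ["</s>"]
        counts.update(zip(words, words[1:]))
    return counts


if __name__ == "__main__":
    spoken = written_to_spoken("Dr. Rao paid 25% on 3 items.")
    print(spoken)
    print(bigram_counts([spoken]).most_common(3))
```

Reading numerals digit by digit is a deliberate simplification. A real system would need proper number, date, and currency verbalization, and possibly the disfluencies and repetitions typical of speech, none of which is modeled here.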