SpeechFind for CDP: Advances in Spoken Document Retrieval for the U.S. Collaborative Digitization Program

Wooil Kim, J. Hansen
Published in: 2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU), 2007-12-01
DOI: 10.1109/ASRU.2007.4430195 (https://doi.org/10.1109/ASRU.2007.4430195)
Citations: 6

Abstract

This paper presents our recent advances for SpeechFind, a spoken document retrieval system designed at CRSS-UTD for the U.S.-based Collaborative Digitization Program (CDP). A prototype of SpeechFind for the CDP currently serves as the search engine for 1,300 hours of CDP audio content, which contains a wide range of acoustic conditions, vocabulary and period selection, and topics. To determine the amount of user-corrected transcripts needed to impact automatic speech recognition (ASR) and audio search, a web-based online interface for verifying ASR-generated transcripts was developed. The procedure for enhancing the transcription performance of SpeechFind is also presented. A selection of adaptation methods for the language and acoustic models is employed, depending on the acoustics of the corpora under test. Experimental results on the CDP corpus demonstrate that the employed model adaptation scheme, which uses the verified transcripts, is effective in improving recognition accuracy. Through a combination of feature/acoustic model enhancement and language model selection, up to 24.8% relative improvement in ASR performance was obtained. The SpeechFind system, employing automatic transcript generation, online CDP transcript correction, and our transcript reliability estimator, demonstrates a comprehensive support mechanism to ensure reliable transcription and search for U.S. libraries with limited speech technology experience.
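The abstract reports a relative improvement but does not state the underlying metric. The sketch below assumes the metric is word error rate (WER), a common choice in ASR evaluation, and shows how a relative improvement figure is derived from a baseline and an adapted system. The function names and the example numbers are hypothetical and are not taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must not be empty")
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


def relative_improvement(baseline_wer: float, adapted_wer: float) -> float:
    """Relative WER reduction in percent, measured against the baseline."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - adapted_wer) / baseline_wer * 100.0


# Made-up illustration: a drop from 50% to 40% WER is a 10-point absolute
# reduction and a 20% relative improvement.
print(relative_improvement(50.0, 40.0))
```

A relative figure depends on the baseline: the same absolute reduction gives a larger relative improvement when the baseline error is lower, so relative numbers are best read alongside the baseline they refer to.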