An improved recursive algorithm for automatic alignment of complex long audio
He Kejia, Liu Gang, Tang Jie, Guo Jun
2009 IEEE International Conference on Network Infrastructure and Digital Content
Published: 2009-12-31
DOI: 10.1109/ICNIDC.2009.5360838 (https://doi.org/10.1109/ICNIDC.2009.5360838)
Abstract
We present an approach for automatically aligning long audio recordings, captured under varied acoustic conditions, with their corresponding transcripts. Accurate time-aligned transcripts make audio material easier to access by supporting applications such as indexing, summarizing, and retrieving audio segments. Accurate time alignments are also needed to label the training data for a speech recognizer's acoustic model. Our method is an improved recursive speech recognition technique that uses a gradually self-adapting language model and acoustic model.
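
The abstract does not spell out the recursion, so the following is only a minimal sketch of one common way recursive alignment can be organized, not the authors' algorithm. It assumes a hypothetical `recognize(t0, t1, expected)` callback that decodes the audio between two times and receives the transcript words expected there (standing in for a language model narrowed to that segment); acoustic model adaptation is not modeled. Runs of matching words serve as anchors, and the unmatched gaps between anchors are decoded again on shorter spans. The names `align_recursive`, `toy_recognizer`, `min_anchor`, and `max_depth` are illustrative assumptions.

```python
from difflib import SequenceMatcher
from typing import Callable, List, Tuple

Word = Tuple[str, float, float]      # (token, start_sec, end_sec)
Aligned = Tuple[int, float, float]   # (transcript index, start_sec, end_sec)
Recognizer = Callable[[float, float, List[str]], List[Word]]


def align_recursive(transcript: List[str], t0: float, t1: float,
                    recognize: Recognizer, min_anchor: int = 2,
                    depth: int = 0, max_depth: int = 3) -> List[Aligned]:
    """Time-stamp the transcript words that can be anchored within [t0, t1]."""
    if not transcript or depth > max_depth:
        return []
    # Hypothetical recognizer call: decode the span, biased toward `transcript`.
    hyp = recognize(t0, t1, transcript)
    tokens = [w[0] for w in hyp]
    matcher = SequenceMatcher(a=transcript, b=tokens, autojunk=False)
    # Keep only matching runs long enough to be trusted as anchors.
    anchors = [m for m in matcher.get_matching_blocks() if m.size >= min_anchor]
    if not anchors:
        return []

    def recurse(lo: int, hi: int, start: float, end: float) -> List[Aligned]:
        # Re-decode an unmatched gap and map indices back to this level.
        sub = align_recursive(transcript[lo:hi], start, end, recognize,
                              min_anchor, depth + 1, max_depth)
        return [(lo + i, s, e) for i, s, e in sub]

    out: List[Aligned] = []
    next_word, next_time = 0, t0
    for m in anchors:
        if m.a > next_word:  # transcript words left unmatched before this anchor
            out += recurse(next_word, m.a, next_time, hyp[m.b][1])
        out += [(m.a + k, hyp[m.b + k][1], hyp[m.b + k][2]) for k in range(m.size)]
        next_word = m.a + m.size
        next_time = hyp[m.b + m.size - 1][2]
    if next_word < len(transcript):  # unmatched tail after the last anchor
        out += recurse(next_word, len(transcript), next_time, t1)
    return out


# Toy data: each word lasts one second; the fake recognizer garbles two
# words whenever it is asked to decode a long span, and is exact on short ones.
WORDS = "we align long audio to its transcript here".split()
TRUTH = [(w, float(i), float(i + 1)) for i, w in enumerate(WORDS)]


def toy_recognizer(t0: float, t1: float, expected: List[str]) -> List[Word]:
    words = [w for w in TRUTH if w[1] >= t0 and w[2] <= t1]
    if len(expected) > 4:
        words = [("<unk>", s, e) if tok in ("audio", "to") else (tok, s, e)
                 for tok, s, e in words]
    return words


result = align_recursive(WORDS, 0.0, float(len(WORDS)), toy_recognizer)
```

In this toy run the first pass anchors the words on either side of the garbled pair, and the second pass recovers the pair from the shorter span between those anchors. The depth limit is a design choice that keeps the recursion finite when a gap never yields a trustworthy anchor.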