{"title":"Efficient System Combination for Syllable-Confusion-Network-Based Chinese Spoken Term Detection","authors":"Jie Gao, Qingwei Zhao, Yonghong Yan, J. Shao","doi":"10.1109/CHINSL.2008.ECP.103","DOIUrl":null,"url":null,"abstract":"This paper examines the system combination issue for syllable-confusion-network (SCN)-based Chinese spoken term detection (STD). System combination for STD usually leads to improvements in accuracy but suffers from increased index size or complicated index structure. This paper explores methods for efficient combination of a word-based system and a syllable-based system while keeping the compactness of the indices. First, a composite SCN is generated using two approaches: lattice combination (The SCN is generated from a combined lattice) and confusion network combination (Two SCNs are combined into one). Then a simple compact index is constructed from this composite SCN by merging cross-system redundant information. The experimental result on a 60-hour corpus shows a relative accuracy improvement of 14.7% is achieved over the baseline syllable-based system. Meanwhile, it reduces the index size by 22.3% compared to the commonly adopted score combination method when achieves comparable accuracy.","PeriodicalId":291958,"journal":{"name":"2008 6th International Symposium on Chinese Spoken Language Processing","volume":"51 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2008-12-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2008 6th International Symposium on Chinese Spoken Language Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CHINSL.2008.ECP.103","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 8
Abstract
This paper examines the system combination issue for syllable-confusion-network (SCN)-based Chinese spoken term detection (STD). System combination for STD usually leads to improvements in accuracy but suffers from increased index size or complicated index structure. This paper explores methods for efficient combination of a word-based system and a syllable-based system while keeping the compactness of the indices. First, a composite SCN is generated using two approaches: lattice combination (The SCN is generated from a combined lattice) and confusion network combination (Two SCNs are combined into one). Then a simple compact index is constructed from this composite SCN by merging cross-system redundant information. The experimental result on a 60-hour corpus shows a relative accuracy improvement of 14.7% is achieved over the baseline syllable-based system. Meanwhile, it reduces the index size by 22.3% compared to the commonly adopted score combination method when achieves comparable accuracy.