Performance of connected digit recognizers with context-dependent word duration modeling
Authors: O. Kwon, C. Un
Published in: Proceedings of APCCAS'96 - Asia Pacific Conference on Circuits and Systems
Publication date: 1996-11-18
DOI: 10.1109/APCAS.1996.569264 (https://doi.org/10.1109/APCAS.1996.569264)
Citations: 6
Abstract
In a Korean connected digit recognizer, insertion and deletion errors account for about half of all recognition errors because the Korean language has two monophonemic digits. Previous studies showed that these errors are not corrected even by discriminative training algorithms. To reduce them, we propose to model context-dependent word duration and incorporate it directly into the decoding algorithm. Experimental results show that incorporating duration information in the postprocessing stage does not achieve significant improvements over a baseline system, whereas the proposed method reduces word error rates by as much as 10% for unknown-length decoding when the recognizer is trained by the maximum likelihood estimation and generalized probabilistic descent methods. Furthermore, simple duration modeling with a bounded uniform distribution achieves performance improvements comparable to detailed duration modeling with a gamma or Gaussian distribution, and is hence a good compromise between performance and complexity.
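As an illustration only, the three duration models named in the abstract can be sketched as log-scores that a decoder adds to a word hypothesis when the word ends. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the `weight` parameter, the placeholder word labels `w1` and `w2`, and the frame bounds in `DURATION_BOUNDS` are all hypothetical and do not come from the paper.

```python
import math

def log_bounded_uniform(d, d_min, d_max):
    """Log-probability of a duration of d frames under a uniform
    distribution on the integer range [d_min, d_max]."""
    if d_min <= d <= d_max:
        return -math.log(d_max - d_min + 1)
    return float("-inf")  # durations outside the bounds are ruled out

def log_gaussian(d, mean, std):
    """Log-density of a Gaussian duration model."""
    z = (d - mean) / std
    return -0.5 * z * z - math.log(std * math.sqrt(2.0 * math.pi))

def log_gamma(d, shape, scale):
    """Log-density of a gamma duration model (shape k, scale theta)."""
    if d <= 0:
        return float("-inf")
    return ((shape - 1.0) * math.log(d) - d / scale
            - math.lgamma(shape) - shape * math.log(scale))

# Hypothetical context-dependent bounds in frames, keyed by
# (previous word, current word). The values are invented for illustration.
DURATION_BOUNDS = {
    ("w1", "w2"): (5, 14),
    ("w2", "w2"): (8, 20),
}

def word_end_score(acoustic_logp, prev_word, word, d, weight=1.0):
    """Combine an acoustic log-likelihood with a duration log-score
    at a word boundary (assumed combination rule: weighted sum)."""
    d_min, d_max = DURATION_BOUNDS[(prev_word, word)]
    return acoustic_logp + weight * log_bounded_uniform(d, d_min, d_max)

# A 3-frame hypothesis for "w2" after "w1" falls below the lower bound,
# so its combined score is -inf and the hypothesis is pruned.
print(word_end_score(-100.0, "w1", "w2", 3))
print(word_end_score(-100.0, "w1", "w2", 10))
```

The bounded uniform model needs only two parameters per context and no density evaluation beyond a range check, which is one way to read the abstract's remark about the trade-off between performance and complexity.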