Error-Driven Adaptive Language Modeling for Chinese Pinyin-to-Character Conversion

J. Huang, D. Powers
Venue: 2011 International Conference on Asian Language Processing
Published: 2011-11-15
DOI: 10.1109/IALP.2011.46 (https://doi.org/10.1109/IALP.2011.46)
Citations: 0

Abstract

The performance of Chinese Pinyin-to-Character conversion degrades severely when the characteristics of the training data and the conversion data differ. Because natural language is highly variable and uncertain, it is impossible to build a complete, general language model that suits every task. Traditional adaptive MAP (maximum a posteriori) models mix task-independent data with task-dependent data using a mixture coefficient, but one can never predict what style of language users will have or what new domains will appear. This paper presents a statistical, error-driven adaptive language modeling approach for Chinese Pinyin input systems. The model is adapted incrementally whenever an error occurs during Pinyin-to-Character conversion. It significantly improves the Pinyin-to-Character conversion rate.
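The abstract does not give the model's details, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a greedy bigram decoder, add-one smoothing, and a fixed interpolation weight between a background model and an adaptive model; the class name `AdaptiveConverter`, the parameter `mix`, and the toy lexicon format are all hypothetical. The adaptive counts change only at positions where the converter's output disagrees with the text the user actually wanted.

```python
from collections import defaultdict


class AdaptiveConverter:
    """Toy Pinyin-to-character converter with error-driven adaptation.

    Illustrative only: greedy bigram decoding, add-one smoothing and a
    fixed mixture weight are assumptions, not taken from the paper.
    """

    def __init__(self, lexicon, background_counts, mix=0.5):
        self.lexicon = lexicon                # syllable -> candidate characters
        self.background = background_counts   # (prev_char, char) -> count
        self.adapted = defaultdict(int)       # counts learned from corrections
        self.mix = mix                        # weight given to the adapted model

    def _prob(self, table, prev, char, candidates):
        # Add-one smoothed P(char | prev), normalised over the candidates.
        total = sum(table.get((prev, c), 0) for c in candidates)
        return (table.get((prev, char), 0) + 1) / (total + len(candidates))

    def score(self, prev, char, candidates):
        # Linear interpolation of the background and adapted estimates.
        background = self._prob(self.background, prev, char, candidates)
        adapted = self._prob(self.adapted, prev, char, candidates)
        return (1 - self.mix) * background + self.mix * adapted

    def convert(self, syllables):
        # Greedy left-to-right decoding: pick the best character per syllable.
        output, prev = [], "<s>"
        for syllable in syllables:
            candidates = self.lexicon[syllable]
            best = max(candidates, key=lambda c: self.score(prev, c, candidates))
            output.append(best)
            prev = best
        return "".join(output)

    def correct(self, syllables, truth):
        # Error-driven update: add a count only where the prediction was wrong.
        predicted = self.convert(syllables)
        prev = "<s>"
        for guess, wanted in zip(predicted, truth):
            if guess != wanted:
                self.adapted[(prev, wanted)] += 1
            prev = wanted
```

With a lexicon such as `{"ta": ["他", "她"]}` and background counts that favour 他, calling `correct(["ta"], "她")` after a wrong conversion shifts probability toward 她 for later conversions. Calling it again once the conversion is already right leaves the counts unchanged.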