Alignment Analysis of English Chinese Bilingual Corpora based on CRFS Model

Miao Wu
Published in: 2021 5th Asian Conference on Artificial Intelligence Technology (ACAIT), 2021-10-29
DOI: 10.1109/acait53529.2021.9731293 (https://doi.org/10.1109/acait53529.2021.9731293)

Abstract

Bilingual corpus alignment is an important research method in machine translation and a valuable source of translation knowledge. This study therefore investigates English-Chinese bilingual corpus alignment based on the conditional random fields (CRFs) model. Starting from a chunk-aligned corpus and taking into account the respective characteristics of English and Chinese, it uses a CRFs model to perform word alignment between the aligned chunks. The results show that the maximum entropy model achieves an accuracy of 45.77% and a recall of 45.58%, the log-linear model achieves 45.73% and 45.68%, and the CRFs model achieves 47.99% and 47.94%. This indicates that the CRFs model can effectively alleviate the asymmetry problem in English-Chinese alignment, and that its alignment error rate is reduced accordingly. The CRFs model is therefore effective for aligning English-Chinese bilingual corpora.
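The abstract reports accuracy, recall, and alignment error rate without stating how they are computed. The following is a minimal sketch of one common way to score word alignments against a gold standard, assuming that the reported "accuracy" corresponds to precision over alignment links and that the gold standard distinguishes sure and possible links. Both are assumptions, not details confirmed by the paper. The function name `alignment_scores`, the `Link` type, and the toy link sets are hypothetical and purely illustrative.

```python
from typing import Set, Tuple

# A link pairs an English word index with a Chinese word index (hypothetical encoding).
Link = Tuple[int, int]


def alignment_scores(
    predicted: Set[Link], sure: Set[Link], possible: Set[Link]
) -> Tuple[float, float, float]:
    """Return (precision, recall, alignment error rate) for one set of links.

    Assumes the commonly used definitions with sure links S, possible
    links P (S is treated as a subset of P), and predicted links A:
        precision = |A & P| / |A|
        recall    = |A & S| / |S|
        AER       = 1 - (|A & S| + |A & P|) / (|A| + |S|)
    """
    if not predicted or not sure:
        raise ValueError("predicted and sure link sets must be non-empty")
    possible = possible | sure  # every sure link also counts as possible
    hits_sure = len(predicted & sure)
    hits_possible = len(predicted & possible)
    precision = hits_possible / len(predicted)
    recall = hits_sure / len(sure)
    aer = 1.0 - (hits_sure + hits_possible) / (len(predicted) + len(sure))
    return precision, recall, aer


# Toy data, invented for illustration only (not taken from the paper).
predicted_links = {(0, 0), (1, 1), (2, 3), (3, 2)}
sure_links = {(0, 0), (1, 1), (2, 2)}
possible_links = {(3, 2)}

precision, recall, aer = alignment_scores(predicted_links, sure_links, possible_links)
print(f"precision={precision:.4f} recall={recall:.4f} AER={aer:.4f}")
```

Under these definitions a lower alignment error rate accompanies higher precision and recall, which is consistent with the abstract's statement that the error rate of the CRFs model is reduced accordingly.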