Comparing a Hand-crafted to an Automatically Generated Feature Set for Deep Learning: Pairwise Translation Evaluation

Despoina Mouratidis, Katia Lida Kermanidis
Proceedings of the Second Workshop Human-Informed Translation and Interpreting Technology associated with RANLP 2019. Published 2019-10-30. DOI: https://doi.org/10.26615/issn.2683-0078.2019_008
Citations: 3

Abstract

The automatic evaluation of machine translation (MT) has proven to be a very significant research topic. Most automatic evaluation methods focus on the output of MT, computing similarity scores that represent translation quality. This work targets the performance of MT evaluation. We present a general scheme that uses linguistic information to learn to classify parallel translations: two MT model outputs and one human (reference) translation. We present three experiments with this scheme using neural networks (NN). The first uses string-based hand-crafted features (Exp1). The second uses automatically trained embeddings from the reference and the two MT outputs (one from a statistical machine translation (SMT) model and the other from a neural machine translation (NMT) model), which are learned using NN (Exp2). The third (Exp3) combines information from the other two experiments. The languages involved are English (EN), Greek (GR) and Italian (IT), and the segments are educational in domain. The proposed language-independent learning scheme, which combines information from the two experiments (Exp3), achieves higher classification accuracy than models using BLEU score information, as well as other classification approaches such as Random Forest (RF) and Support Vector Machine (SVM).
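To make the Exp1 setting more concrete, the following is a minimal sketch of how string-based features might be computed for one (SMT output, NMT output, reference) triple. The abstract does not list the actual hand-crafted features, so the three used here (length ratio, word-type coverage, token sequence similarity), the function names, and the example sentences are all illustrative assumptions. The final rule is a naive stand-in for the trained NN classifier, not the authors' model.

```python
from difflib import SequenceMatcher


def tokens(text):
    """Lowercase whitespace tokenization (an assumption; the paper's tokenizer is not specified)."""
    return text.lower().split()


def string_features(candidate, reference):
    """Return three illustrative string-based features for one MT output against the reference."""
    cand, ref = tokens(candidate), tokens(reference)
    ref_types = set(ref)
    shared_types = set(cand) & ref_types
    return [
        len(cand) / max(len(ref), 1),               # length ratio
        len(shared_types) / max(len(ref_types), 1),  # share of reference word types covered
        SequenceMatcher(None, cand, ref).ratio(),    # token sequence similarity
    ]


def pairwise_features(smt_output, nmt_output, reference):
    """Concatenate the features of both MT outputs into one vector for a pairwise classifier."""
    return string_features(smt_output, reference) + string_features(nmt_output, reference)


def naive_label(smt_output, nmt_output, reference):
    """Hypothetical baseline: prefer the output whose token sequence is closer to the reference."""
    feats = pairwise_features(smt_output, nmt_output, reference)
    return "NMT" if feats[5] >= feats[2] else "SMT"


if __name__ == "__main__":
    ref = "the students attend the lecture"
    smt = "students the lecture attend"
    nmt = "the students attend the lecture"
    print(pairwise_features(smt, nmt, ref))
    print(naive_label(smt, nmt, ref))
```

In a setup like the one described, a vector of this kind would be fed to a classifier (NN, RF or SVM) trained on annotated pairs, and in the Exp3 setting it would be combined with the learned embeddings.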