Comparing a Hand-crafted to an Automatically Generated Feature Set for Deep Learning: Pairwise Translation Evaluation

Proceedings of the Second Workshop Human-Informed Translation and Interpreting Technology associated with RANLP 2019. Pub Date: 2019-10-30. DOI: 10.26615/issn.2683-0078.2019_008

Despoina Mouratidis, Katia Lida Kermanidis


Citations: 3

Abstract

The automatic evaluation of machine translation (MT) has proven to be a very significant research topic. Most automatic evaluation methods focus on the output of MT, computing similarity scores that represent translation quality. This work targets the performance of MT evaluation. We present a general scheme for learning to classify parallel translations, using linguistic information from two MT model outputs and one human (reference) translation. We present three experiments on this scheme using neural networks (NN): the first uses string-based hand-crafted features (Exp1); the second uses automatically trained embeddings from the reference and the two MT outputs (one from a statistical machine translation (SMT) model and the other from a neural machine translation (NMT) model), which are learned using NN (Exp2); and the third (Exp3) combines information from the other two experiments. The languages involved are English (EN), Greek (GR) and Italian (IT), and the segments are educational in domain. The proposed language-independent learning scheme, which combines information from the two experiments (Exp3), achieves higher classification accuracy than models using BLEU score information as well as other classification approaches, such as Random Forest (RF) and Support Vector Machine (SVM).
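The abstract does not list the hand-crafted features used in Exp1. As a minimal sketch, assuming simple length ratios and token-overlap ratios stand in for the string-based features, the triple of SMT output, NMT output and reference could be turned into one feature vector for a pairwise classifier as follows. The function names (`tokens`, `overlap_ratio`, `length_ratio`, `pairwise_features`) and the whitespace tokenizer are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter
from typing import List


def tokens(segment: str) -> List[str]:
    """Lowercase whitespace tokenization (assumed placeholder for a real tokenizer)."""
    return segment.lower().split()


def overlap_ratio(hypothesis: List[str], reference: List[str]) -> float:
    """Fraction of hypothesis tokens that also occur in the reference, with clipped counts."""
    if not hypothesis:
        return 0.0
    ref_counts = Counter(reference)
    matched = sum(
        min(count, ref_counts[tok]) for tok, count in Counter(hypothesis).items()
    )
    return matched / len(hypothesis)


def length_ratio(hypothesis: List[str], reference: List[str]) -> float:
    """Hypothesis length in tokens divided by reference length in tokens."""
    return len(hypothesis) / len(reference) if reference else 0.0


def pairwise_features(smt: str, nmt: str, ref: str) -> List[float]:
    """Build one feature vector for the (SMT output, NMT output, reference) triple.

    The chosen features are hypothetical examples of string-based features.
    """
    s, n, r = tokens(smt), tokens(nmt), tokens(ref)
    return [
        length_ratio(s, r),   # SMT length relative to the reference
        length_ratio(n, r),   # NMT length relative to the reference
        overlap_ratio(s, r),  # SMT token overlap with the reference
        overlap_ratio(n, r),  # NMT token overlap with the reference
        overlap_ratio(s, n),  # agreement between the two MT outputs
    ]


if __name__ == "__main__":
    features = pairwise_features(
        "the cat sat",
        "a cat sat on the mat",
        "the cat sat on the mat",
    )
    print(features)
```

A vector of this kind would be the input to the classifier in an Exp1-style setup. In an Exp3-style setup it could be concatenated with learned segment embeddings, although the abstract does not specify how the two sources of information are combined.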

