Exploring the Effectiveness of Employing Limited Resources for Deep Neural Pairwise Evaluation of Machine Translation

2021 12th International Conference on Information, Intelligence, Systems & Applications (IISA). Pub Date: 2021-07-12. DOI: 10.1109/IISA52424.2021.9555556

Despoina Mouratidis, Katia Lida Kermanidis



Abstract



Exploring the Effectiveness of Employing Limited Resources for Deep Neural Pairwise Evaluation of Machine Translation

This paper introduces a light-resource learning schema, i.e. a schema that depends on limited resources, which aims to choose the better of two machine translation (MT) outputs based on information about the source segments (SSE) and string-based features. A concatenation of vectors, including mathematically calculated embeddings of the SSE, the statistical MT (SMT) segments (S1) and the neural MT (NMT) segments (S2), is used as input to a neural network (NN). Experiments are run on two forms of text structure, an informal corpus (C1) and a formal, well-structured corpus (C2), for the English (EN) – Greek (EL) language pair. Instead of relying on annotations by expert annotators, a novel automatic metric, the quality estimation (QE) score, is proposed for determining the better translation. This score is based on string-based features derived from both the SSE and the MT segments. Experimental results show that the proposed feed-forward NN performs well, comparably to existing state-of-the-art models for MT evaluation that require more sophisticated resources.
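The pipeline described in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. The abstract does not specify the embedding method, the string-based features, the QE formula, or the network size, so everything below is an assumption made for illustration: `embed` is a toy hashed character-trigram vector, `string_features` and `toy_qe_score` are hypothetical stand-ins, and the network is an untrained two-layer feed-forward pass with random weights.

```python
import math
import random

DIM = 8  # toy embedding size (assumption, not taken from the paper)


def embed(segment, dim=DIM):
    """Toy stand-in for an embedding: hashed character trigrams, L2-normalised."""
    vec = [0.0] * dim
    text = segment.lower()
    for i in range(max(len(text) - 2, 1)):
        vec[sum(map(ord, text[i:i + 3])) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def string_features(source, mt):
    """Two illustrative string-based features: length ratio and shared-token rate."""
    src, out = source.lower().split(), mt.lower().split()
    length_ratio = min(len(src), len(out)) / max(len(src), len(out), 1)
    shared = len(set(src) & set(out)) / max(len(set(out)), 1)
    return [length_ratio, shared]


def toy_qe_score(source, mt):
    """Hypothetical QE score: plain mean of the features (not the paper's formula)."""
    feats = string_features(source, mt)
    return sum(feats) / len(feats)


def build_input(source, s1, s2):
    """Concatenate the SSE, S1 (SMT) and S2 (NMT) vectors into one NN input."""
    return embed(source) + embed(s1) + embed(s2)


def init_layer(n_in, n_out, rng):
    """Random weights and zero biases for one dense layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(x, layers):
    """Feed-forward pass: ReLU on hidden layers, sigmoid on the single output unit."""
    h = x
    for idx, (weights, biases) in enumerate(layers):
        z = [sum(w * v for w, v in zip(row, h)) + b for row, b in zip(weights, biases)]
        if idx == len(layers) - 1:
            h = [1.0 / (1.0 + math.exp(-v)) for v in z]
        else:
            h = [max(0.0, v) for v in z]
    return h[0]


rng = random.Random(0)
layers = [init_layer(3 * DIM, 16, rng), init_layer(16, 1, rng)]

source = "The meeting starts at 10 in Athens"
s1 = "Η συνάντηση αρχίζει στις 10 σε Athens"   # invented SMT-style output
s2 = "Η συνάντηση ξεκινά στις 10 στην Αθήνα"   # invented NMT-style output

x = build_input(source, s1, s2)
# Training label derived automatically instead of from human annotation:
# 1 if S2 scores higher than S1 under the toy QE score, else 0.
label = 1 if toy_qe_score(source, s2) > toy_qe_score(source, s1) else 0
prob = forward(x, layers)  # untrained estimate that S2 is the better translation
```

In a real setup the weights would be trained so that `prob` matches the QE-derived `label` over a corpus of (SSE, S1, S2) triples. The sketch only shows how the pieces connect: an automatic score supplies the supervision, and the concatenated vectors supply the input.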
