Exploring the Effectiveness of Employing Limited Resources for Deep Neural Pairwise Evaluation of Machine Translation

Despoina Mouratidis, Katia Lida Kermanidis
Published in: 2021 12th International Conference on Information, Intelligence, Systems & Applications (IISA)
Publication date: 2021-07-12
DOI: 10.1109/IISA52424.2021.9555556 (https://doi.org/10.1109/IISA52424.2021.9555556)
Citations: 0

Abstract

This paper introduces a light-resource learning schema, i.e. a schema that depends on limited resources, for choosing the better of two machine translation (MT) outputs using information about the source segments (SSE) and string-based features. The input to a neural network (NN) is a concatenation of vectors that includes mathematically calculated embeddings of the SSE and of the statistical MT (SMT) and neural MT (NMT) segments (S1 and S2, respectively). Experiments are run on two kinds of text for the English (EN) – Greek (EL) language pair: an informal corpus (C1) and a formal, well-structured corpus (C2). Instead of relying on expert annotations, the paper proposes a novel automatic metric, the quality estimation (QE) score, for determining the better translation. This score is based on string-based features derived from both the SSE and the MT segments. Experimental results show that the proposed feed-forward NN performs well, comparably to existing state-of-the-art MT evaluation models that require more sophisticated resources.
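The input construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the embedding method, the string-based features, the layer sizes, or how the QE score is computed, so the three features, the helper names (`string_features`, `build_input`, `FeedForward`), the dimensions, and the untrained random weights below are all assumptions made for illustration only.

```python
import math
import random
from typing import List, Sequence


def string_features(source: str, hypothesis: str) -> List[float]:
    """Hypothetical string-based features (not taken from the paper)."""
    src_tokens = source.split()
    hyp_tokens = hypothesis.split()
    # Length of the MT output relative to the source, in tokens and characters.
    token_ratio = len(hyp_tokens) / max(len(src_tokens), 1)
    char_ratio = len(hypothesis) / max(len(source), 1)
    # Share of distinct source tokens that reappear verbatim in the MT output.
    shared = len(set(src_tokens) & set(hyp_tokens))
    copy_rate = shared / max(len(set(src_tokens)), 1)
    return [token_ratio, char_ratio, copy_rate]


def build_input(src_vec: Sequence[float], s1_vec: Sequence[float],
                s2_vec: Sequence[float], feats1: Sequence[float],
                feats2: Sequence[float]) -> List[float]:
    """Concatenate SSE, S1 and S2 vectors with the string features."""
    return list(src_vec) + list(s1_vec) + list(s2_vec) + list(feats1) + list(feats2)


class FeedForward:
    """One hidden ReLU layer and a sigmoid output; weights are random, not trained."""

    def __init__(self, n_in: int, n_hidden: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]
        self.b2 = 0.0

    def predict_proba(self, x: Sequence[float]) -> float:
        """Return a score in (0, 1), read here as 'S2 is the better output'."""
        hidden = [max(0.0, sum(w * v for w, v in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        logit = sum(w * h for w, h in zip(self.w2, hidden)) + self.b2
        return 1.0 / (1.0 + math.exp(-logit))


# Toy example: 4-dimensional placeholder embeddings and made-up segments.
source = "the cat sat on the mat"
s1 = "the cat sat mat"          # stands in for the SMT output (S1)
s2 = "the cat sat on the mat"   # stands in for the NMT output (S2)
x = build_input([0.1] * 4, [0.2] * 4, [0.3] * 4,
                string_features(source, s1), string_features(source, s2))
model = FeedForward(n_in=len(x), n_hidden=8)
print(len(x), model.predict_proba(x))
```

In a real setup the weights would be learned from labeled segment pairs, and the label itself would come from the QE score the abstract mentions; neither step is shown here because the abstract gives no detail on them.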