Exploring the Effectiveness of Employing Limited Resources for Deep Neural Pairwise Evaluation of Machine Translation
Authors: Despoina Mouratidis, Katia Lida Kermanidis
Published in: 2021 12th International Conference on Information, Intelligence, Systems & Applications (IISA), 2021-07-12
DOI: 10.1109/IISA52424.2021.9555556 (https://doi.org/10.1109/IISA52424.2021.9555556)
Citations: 0
Abstract
This paper introduces a light-resource learning schema, i.e. a schema that depends on limited resources, which chooses the better of two machine translation (MT) outputs based on information about the source segments (SSE) and string-based features. A concatenation of vectors, including mathematically calculated embeddings of the SSE and of the statistical MT (SMT) and neural MT (NMT) segments (S1 and S2, respectively), is used as input to a neural network (NN). Experiments are run on two forms of text structure, an informal corpus (C1) and a formal, well-structured corpus (C2), for the English (EN) – Greek (EL) language pair. Rather than relying on annotations by high-level experts, the schema uses a novel automatic metric, the quality estimation (QE) score, to determine the better translation. This score is based on string-based features derived from both the SSE and the MT segments. Experimental results show that the proposed feed-forward NN performs well, comparable to existing state-of-the-art models for MT evaluation that require more sophisticated resources.
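The input construction described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and not taken from the paper: the two string-based features (token and character length ratios), the toy embedding and hidden-layer sizes, the random stand-in embeddings, and the single-hidden-layer network with untrained random weights. It shows only the shape of the computation: concatenate the SSE, S1 and S2 vectors with string features, then pass the result through a feed-forward network that outputs a preference between S1 and S2.

```python
import math
import random

def string_features(source: str, hypothesis: str) -> list[float]:
    """Two illustrative string-based features (assumed, not the paper's feature set)."""
    token_ratio = len(hypothesis.split()) / max(len(source.split()), 1)
    char_ratio = len(hypothesis) / max(len(source), 1)
    return [token_ratio, char_ratio]

def build_input(emb_sse, emb_s1, emb_s2, feats_s1, feats_s2):
    """Concatenate the three segment embeddings and the string features."""
    return emb_sse + emb_s1 + emb_s2 + feats_s1 + feats_s2

def feed_forward(x, w_hidden, b_hidden, w_out, b_out):
    """One ReLU hidden layer followed by a single sigmoid output unit."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    logit = sum(w * h for w, h in zip(w_out, hidden)) + b_out
    return 1.0 / (1.0 + math.exp(-logit))

rng = random.Random(0)
EMB_DIM, HIDDEN = 4, 3  # toy sizes, chosen only for illustration

def rand_vec(n):
    """Random vector standing in for a real embedding or trained weights."""
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]

source = "the cat sat on the mat"   # toy source segment (SSE)
s1 = "η γάτα κάθισε χαλί"           # stand-in for the SMT output (S1)
s2 = "η γάτα κάθισε στο χαλί"       # stand-in for the NMT output (S2)

x = build_input(rand_vec(EMB_DIM), rand_vec(EMB_DIM), rand_vec(EMB_DIM),
                string_features(source, s1), string_features(source, s2))

w_hidden = [rand_vec(len(x)) for _ in range(HIDDEN)]
b_hidden = rand_vec(HIDDEN)
w_out, b_out = rand_vec(HIDDEN), 0.0

# Interpreted here as the probability that S2 is the better translation.
p_s2_better = feed_forward(x, w_hidden, b_hidden, w_out, b_out)
print("preferred:", "S2" if p_s2_better >= 0.5 else "S1")
```

Because the weights are random, the printed preference is meaningless. In a trained setting the target label would come from the automatic QE score rather than from human annotation, as the abstract describes.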