NMT Enhancement based on Knowledge Graph Mining with Pre-trained Language Model

Hao Yang, Ying Qin, Yao Deng, Minghan Wang
{"title":"NMT Enhancement based on Knowledge Graph Mining with Pre-trained Language Model","authors":"Hao Yang, Ying Qin, Yao Deng, Minghan Wang","doi":"10.23919/ICACT48636.2020.9061292","DOIUrl":null,"url":null,"abstract":"Pre-trained language models like Bert, RoBERTa, GPT, etc. have achieved SOTA effects on multiple NLP tasks (e.g. sentiment classification, information extraction, event extraction, etc.). We propose a simple method based on knowledge graph to improve the quality of machine translation. First, we propose a multi-task learning model that learns subjects, objects, and predicates at the same time. Second, we treat different predicates as different fields, and improve the recognition ability of NMT models in different fields through classification labels. Finally, beam search combined with L2R, R2L rearranges results through entities. Based on the CWMT2018 experimental data, using the predicate's domain classification identifier, the BLUE score increased from 33.58% to 37.63%, and through L2R, R2L rearrangement, the BLEU score increased to 39.25%, overall improvement is more than 5 percentage","PeriodicalId":296763,"journal":{"name":"2020 22nd International Conference on Advanced Communication Technology (ICACT)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 22nd International Conference on Advanced Communication Technology (ICACT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.23919/ICACT48636.2020.9061292","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

Abstract

Pre-trained language models such as BERT, RoBERTa, and GPT have achieved state-of-the-art (SOTA) results on multiple NLP tasks (e.g., sentiment classification, information extraction, and event extraction). We propose a simple method based on knowledge graphs to improve the quality of machine translation. First, we propose a multi-task learning model that jointly learns subjects, objects, and predicates. Second, we treat different predicates as different domains and improve the recognition ability of NMT models across domains through classification labels. Finally, beam search combined with left-to-right (L2R) and right-to-left (R2L) decoding reranks the results using entities. On the CWMT2018 experimental data, using the predicate's domain classification identifier, the BLEU score increased from 33.58% to 37.63%, and L2R/R2L reranking further increased it to 39.25%, an overall improvement of more than 5 percentage points.
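Two of the abstract's steps lend themselves to short illustrations. First, the "classification label" idea can be read as prepending a predicate-derived domain tag to the source sentence so the NMT model can condition on it. Below is a minimal sketch under that reading; the tag format, the function name, and the example predicate are illustrative assumptions, not the authors' implementation.

# Hypothetical sketch: prepend a predicate-derived domain tag to the
# source sentence before it is fed to the NMT model. The tag vocabulary
# and the upstream triple-extraction step are assumptions.

def tag_source(sentence: str, predicate_domain: str) -> str:
    """Prefix the source with a pseudo-token marking the predicate's domain."""
    return f"<{predicate_domain}> {sentence}"

# Example: a sentence whose mined triple has the predicate "founded".
print(tag_source("Steve Jobs founded Apple.", "founded"))
# -> "<founded> Steve Jobs founded Apple."

Second, the L2R/R2L step can be read as n-best reranking: candidates produced by a left-to-right beam-search decoder are rescored with a right-to-left model and the two scores interpolated. A sketch under that assumption, where r2l_score and the interpolation weight alpha are illustrative stand-ins:

# Minimal L2R/R2L reranking sketch. `nbest` holds (hypothesis, L2R
# log-probability) pairs from beam search; `r2l_score` is an assumed
# callable returning the right-to-left model's log-probability.

def rerank(nbest, r2l_score, alpha=0.5):
    """Interpolate L2R and R2L log-probabilities; return the best hypothesis."""
    best_hyp, best_score = None, float("-inf")
    for hyp, l2r in nbest:
        score = alpha * l2r + (1.0 - alpha) * r2l_score(hyp)
        if score > best_score:
            best_hyp, best_score = hyp, score
    return best_hyp

candidates = [("translation A", -3.2), ("translation B", -3.5)]
print(rerank(candidates, r2l_score=lambda h: -2.9 if "B" in h else -4.0))
# "translation B" wins after R2L rescoring: 0.5*(-3.5) + 0.5*(-2.9) = -3.2
# beats 0.5*(-3.2) + 0.5*(-4.0) = -3.6 for "translation A".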