CokeBERT: Contextual knowledge selection and embedding towards enhanced pre-trained language models

Yusheng Su, Xu Han, Zhengyan Zhang, Yankai Lin, Peng Li, Zhiyuan Liu, Jie Zhou, Maosong Sun
{"title":"CokeBERT: Contextual knowledge selection and embedding towards enhanced pre-trained language models","authors":"Yusheng Su ,&nbsp;Xu Han ,&nbsp;Zhengyan Zhang ,&nbsp;Yankai Lin ,&nbsp;Peng Li ,&nbsp;Zhiyuan Liu ,&nbsp;Jie Zhou ,&nbsp;Maosong Sun","doi":"10.1016/j.aiopen.2021.06.004","DOIUrl":null,"url":null,"abstract":"<div><p>Several recent efforts have been devoted to enhancing pre-trained language models (PLMs) by utilizing extra heterogeneous knowledge in knowledge graphs (KGs), and achieved consistent improvements on various knowledge-driven NLP tasks. However, most of these knowledge-enhanced PLMs embed static sub-graphs of KGs (“knowledge context”), regardless of that the knowledge required by PLMs may change dynamically according to specific text (“textual context”). In this paper, we propose a novel framework named Coke to dynamically select contextual knowledge and embed knowledge context according to textual context for PLMs, which can avoid the effect of redundant and ambiguous knowledge in KGs that cannot match the input text. Our experimental results show that Coke outperforms various baselines on typical knowledge-driven NLP tasks, indicating the effectiveness of utilizing dynamic knowledge context for language understanding. Besides the performance improvements, the dynamically selected knowledge in Coke can describe the semantics of text-related knowledge in a more interpretable form than the conventional PLMs. Our implementation and datasets are publicly available.</p></div>","PeriodicalId":100068,"journal":{"name":"AI Open","volume":"2 ","pages":"Pages 127-134"},"PeriodicalIF":0.0000,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1016/j.aiopen.2021.06.004","citationCount":"15","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"AI Open","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2666651021000188","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Several recent efforts have been devoted to enhancing pre-trained language models (PLMs) with extra heterogeneous knowledge from knowledge graphs (KGs), and have achieved consistent improvements on various knowledge-driven NLP tasks. However, most of these knowledge-enhanced PLMs embed static sub-graphs of KGs (“knowledge context”), ignoring the fact that the knowledge a PLM requires may change dynamically with the specific text (“textual context”). In this paper, we propose a novel framework named Coke that dynamically selects contextual knowledge and embeds the knowledge context according to the textual context for PLMs, which avoids the effect of redundant and ambiguous knowledge in KGs that does not match the input text. Our experimental results show that Coke outperforms various baselines on typical knowledge-driven NLP tasks, indicating the effectiveness of utilizing dynamic knowledge context for language understanding. Beyond the performance improvements, the dynamically selected knowledge in Coke describes the semantics of text-related knowledge in a more interpretable form than conventional PLMs do. Our implementation and datasets are publicly available.
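The abstract leaves the selection mechanism itself abstract. Below is a minimal, hypothetical sketch of what dynamic knowledge-context selection can look like: candidate KG neighbors of a mentioned entity are scored against the textual-context embedding, and only the top-scoring neighbors are attended over and summarized for fusion back into the PLM. The function `select_knowledge_context`, the tensor names, and the top-k scoring scheme are illustrative assumptions for exposition, not the paper's actual implementation.

```python
import torch
import torch.nn.functional as F

def select_knowledge_context(text_emb: torch.Tensor,
                             neighbor_embs: torch.Tensor,
                             k: int = 2):
    """Keep only the KG neighbors most relevant to the current sentence.

    text_emb:      (d,)   textual-context vector, e.g. a PLM sentence embedding
    neighbor_embs: (n, d) embeddings of candidate (relation, entity) neighbors
                   of a mentioned entity in the KG
    Returns the indices of the kept neighbors and an attention-weighted
    summary vector that could be fused into the PLM's entity representation.
    """
    # Relevance of each candidate triple to the textual context.
    scores = neighbor_embs @ text_emb                      # (n,)
    top = torch.topk(scores, k=min(k, scores.numel()))
    # Attend only over the selected neighbors; redundant or ambiguous
    # neighbors are discarded instead of being averaged in.
    weights = F.softmax(top.values, dim=0)                 # (k,)
    summary = weights @ neighbor_embs[top.indices]         # (d,)
    return top.indices, summary

if __name__ == "__main__":
    torch.manual_seed(0)
    text_emb = torch.randn(8)          # toy textual-context vector
    neighbor_embs = torch.randn(5, 8)  # five candidate KG neighbors
    idx, summary = select_knowledge_context(text_emb, neighbor_embs, k=2)
    print("kept neighbors:", idx.tolist())
    print("summary vector shape:", tuple(summary.shape))
```

The key design point the sketch illustrates is that the kept sub-graph is a function of the input sentence: a different textual context yields different scores and hence a different knowledge context, in contrast to the static sub-graph embedding the paper criticizes.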
