Local-to-Global Structure-Aware Transformer for Question Answering over Structured Knowledge

IF 0.6 4区计算机科学 Q4 COMPUTER SCIENCE, INFORMATION SYSTEMS

IEICE Transactions on Information and Systems Pub Date : 2023-10-01 DOI:10.1587/transinf.2023edp7034

Yingyao WANG, Han WANG, Chaoqun DUAN, Tiejun ZHAO

{"title":"Local-to-Global Structure-Aware Transformer for Question Answering over Structured Knowledge","authors":"Yingyao WANG, Han WANG, Chaoqun DUAN, Tiejun ZHAO","doi":"10.1587/transinf.2023edp7034","DOIUrl":null,"url":null,"abstract":"Question-answering tasks over structured knowledge (i.e., tables and graphs) require the ability to encode structural information. Traditional pre-trained language models trained on linear-chain natural language cannot be directly applied to encode tables and graphs. The existing methods adopt the pre-trained models in such tasks by flattening structured knowledge into sequences. However, the serialization operation will lead to the loss of the structural information of knowledge. To better employ pre-trained transformers for structured knowledge representation, we propose a novel structure-aware transformer (SATrans) that injects the local-to-global structural information of the knowledge into the mask of the different self-attention layers. Specifically, in the lower self-attention layers, SATrans focus on the local structural information of each knowledge token to learn a more robust representation of it. In the upper self-attention layers, SATrans further injects the global information of the structured knowledge to integrate the information among knowledge tokens. In this way, the SATrans can effectively learn the semantic representation and structural information from the knowledge sequence and the attention mask, respectively. We evaluate SATrans on the table fact verification task and the knowledge base question-answering task. Furthermore, we explore two methods to combine symbolic and linguistic reasoning for these tasks to solve the problem that the pre-trained models lack symbolic reasoning ability. The experiment results reveal that the methods consistently outperform strong baselines on the two benchmarks.","PeriodicalId":55002,"journal":{"name":"IEICE Transactions on Information and Systems","volume":"15 1","pages":"0"},"PeriodicalIF":0.6000,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEICE Transactions on Information and Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1587/transinf.2023edp7034","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}

引用次数: 0

Abstract

Question-answering tasks over structured knowledge (i.e., tables and graphs) require the ability to encode structural information. Traditional pre-trained language models trained on linear-chain natural language cannot be directly applied to encode tables and graphs. The existing methods adopt the pre-trained models in such tasks by flattening structured knowledge into sequences. However, the serialization operation will lead to the loss of the structural information of knowledge. To better employ pre-trained transformers for structured knowledge representation, we propose a novel structure-aware transformer (SATrans) that injects the local-to-global structural information of the knowledge into the mask of the different self-attention layers. Specifically, in the lower self-attention layers, SATrans focus on the local structural information of each knowledge token to learn a more robust representation of it. In the upper self-attention layers, SATrans further injects the global information of the structured knowledge to integrate the information among knowledge tokens. In this way, the SATrans can effectively learn the semantic representation and structural information from the knowledge sequence and the attention mask, respectively. We evaluate SATrans on the table fact verification task and the knowledge base question-answering task. Furthermore, we explore two methods to combine symbolic and linguistic reasoning for these tasks to solve the problem that the pre-trained models lack symbolic reasoning ability. The experiment results reveal that the methods consistently outperform strong baselines on the two benchmarks.

查看原文本刊更多论文

面向结构化知识问答的局部到全局结构感知转换器

结构化知识(即表格和图表)上的问答任务需要对结构化信息进行编码的能力。传统的基于线性链自然语言的预训练语言模型不能直接用于表和图的编码。现有方法通过将结构化知识扁平化为序列，采用预先训练好的模型。然而，序列化操作会导致知识结构信息的丢失。为了更好地利用预先训练好的变压器进行结构化知识表示，我们提出了一种新的结构感知变压器(satans)，它将知识的局部到全局结构信息注入到不同自关注层的掩膜中。具体来说，在较低的自关注层中，satans专注于每个知识令牌的局部结构信息，以学习它的更健壮的表示。在上层自关注层，satans进一步注入结构化知识的全局信息，实现知识标记间的信息集成。这样，satans可以有效地分别从知识序列和注意掩模中学习到语义表示和结构信息。我们在表格事实验证任务和知识库问答任务上对satans进行评估。此外，我们探索了两种将符号推理和语言推理相结合的方法来解决预训练模型缺乏符号推理能力的问题。实验结果表明，该方法在两个基准上的表现始终优于强基线。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

IEICE Transactions on Information and Systems 工程技术-计算机：软件工程

CiteScore

1.80

自引率

0.00%

发文量

238

审稿时长

5.0 months

期刊介绍： Published by The Institute of Electronics, Information and Communication Engineers Subject Area: Mathematics Physics Biology, Life Sciences and Basic Medicine General Medicine, Social Medicine, and Nursing Sciences Clinical Medicine Engineering in General Nanosciences and Materials Sciences Mechanical Engineering Electrical and Electronic Engineering Information Sciences Economics, Business & Management Psychology, Education.