QuAChIE: Question Answering based Chinese Information Extraction System

Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval. Pub Date: 2020-07-25. DOI: 10.1145/3397271.3401411

Dongyu Ru, Zhenghui Wang, Lin Qiu, Hao Zhou, Lei Li, Weinan Zhang, Yong Yu

Citations: 4

Abstract

In this paper, we present the design of QuAChIE, a Question Answering based Chinese Information Extraction system. QuAChIE relies mainly on a well-trained question answering model to extract high-quality triples. Each pair of head entity and relation is treated as a question, with the input text serving as the context. To train and evaluate each model in the system, we build a large-scale information extraction dataset from Wikidata and Wikipedia pages by distant supervision. The advanced models, implemented on top of a pre-trained language model, together with the large volume of distantly supervised data, enable QuAChIE to extract relation triples from documents with cross-sentence correlations. Experimental results on the test set and a case study based on the interactive demonstration show satisfactory information extraction quality on Chinese document-level texts.
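The question answering formulation and the distant supervision step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the question template in `build_question`, the `QAExample` record, and the exact-string matching in `distant_label` are all hypothetical, since the abstract does not specify the question format or the alignment procedure. The toy document and triples are made up for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)


@dataclass
class QAExample:
    question: str      # built from the head entity and the relation
    context: str       # the whole document, so answers may cross sentences
    answer: str        # the tail entity
    answer_start: int  # character offset of the answer in the context


def build_question(head: str, relation: str) -> str:
    # Hypothetical template; the source does not state how questions are phrased.
    return f"{head}的{relation}是？"


def distant_label(document: str, kb_triples: List[Triple]) -> List[QAExample]:
    """Assumed labeling rule: a knowledge-base triple yields a training example
    when both its head and its tail appear verbatim in the document."""
    examples = []
    for head, relation, tail in kb_triples:
        if head not in document:
            continue
        start = document.find(tail)
        if start == -1:
            continue  # tail not mentioned, so no answer span can be labeled
        examples.append(
            QAExample(build_question(head, relation), document, tail, start)
        )
    return examples


document = "鲁迅出生于浙江绍兴。他的代表作包括《狂人日记》。"
kb = [
    ("鲁迅", "出生地", "绍兴"),
    ("鲁迅", "代表作", "狂人日记"),  # head and tail sit in different sentences
    ("鲁迅", "逝世地", "上海"),      # tail absent from the text, so it is skipped
]
examples = distant_label(document, kb)
for ex in examples:
    print(ex.question, "->", ex.answer, "@", ex.answer_start)
```

At inference time, the same question construction would be paired with an extractive question answering model that predicts the answer span. Exact-string matching is only a stand-in here; distant supervision of this kind is known to produce noisy labels.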
