Semi-supervised Auto-encoder Based Event Detection in Constructing Knowledge Graph for Social Good

2019 IEEE/WIC/ACM International Conference on Web Intelligence (WI) Pub Date : 2019-10-01 DOI:10.1145/3350546.3360736

Yue Zhao, Xiaolong Jin, Yuanzhuo Wang, Xueqi Cheng


Citations: 1

Abstract

Knowledge graphs have recently been applied extensively in many areas (e.g., disaster management and relief, disease diagnosis). For example, event-centric knowledge graphs have been developed to improve decision making in disaster management and relief. This paper focuses on event detection, a prerequisite of event extraction for constructing event-centric knowledge graphs. Event detection identifies the trigger words of events in the sentences of a document and classifies the types of those events. Context information is clearly useful for this task. Feature-based methods therefore adopt cross-sentence information, but they suffer from the complexity of human-designed features. Representation-based methods, on the other hand, learn document-level embeddings, which contain much noise caused by unsupervised learning. To overcome these problems, we propose a new model based on a Semi-supervised Auto-Encoder, which learns Context information to Enhance Event Detection and is thus called SAE-CEED. The model first uses large-scale unlabeled texts to pre-train an auto-encoder, so that the segment embeddings learned by the encoder contain the semantic and order information of the original text. It then uses the decoder to extract the context embeddings and fine-tunes them to enhance a bidirectional neural network model that identifies event triggers and their types in sentences. Experiments on the benchmark ACE-2005 dataset demonstrate the effectiveness of the proposed SAE-CEED model. In addition, we systematically conduct a series of experiments to verify how the length of the text segments used in pre-training the auto-encoder affects event detection.
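As a rough illustration of the two data views the abstract mentions, the sketch below cuts a token sequence into fixed-length segments (the unit on which the auto-encoder is pre-trained) and frames event detection as assigning each token an event type or NONE. It is not the authors' code: the function names, the non-overlapping segmentation, the example sentence, and the `Type.Subtype` label strings are all assumptions made for illustration, and the abstract does not specify how SAE-CEED actually segments or labels text.

```python
from typing import Dict, List


def split_into_segments(tokens: List[str], segment_len: int) -> List[List[str]]:
    """Cut a token sequence into consecutive, non-overlapping segments.

    Assumption: fixed-length segments with a shorter final remainder.
    """
    if segment_len <= 0:
        raise ValueError("segment_len must be positive")
    return [tokens[i:i + segment_len] for i in range(0, len(tokens), segment_len)]


def label_triggers(tokens: List[str], triggers: Dict[int, str]) -> List[str]:
    """Give each token its event type, or "NONE" if it is not a trigger.

    `triggers` maps a token index to an event type label (hypothetical input).
    """
    return [triggers.get(i, "NONE") for i in range(len(tokens))]


# Hypothetical example sentence with two trigger words.
tokens = "A bomb exploded downtown and three people died".split()

# Segment length is the quantity the paper varies in its experiments;
# 3 is an arbitrary value chosen here.
segments = split_into_segments(tokens, 3)

# Token 2 ("exploded") and token 7 ("died") are marked as triggers.
labels = label_triggers(tokens, {2: "Conflict.Attack", 7: "Life.Die"})

for token, label in zip(tokens, labels):
    print(f"{token}\t{label}")
```

In this framing, a detector is a per-token classifier over the event types plus NONE, and the segment length passed to `split_into_segments` is the kind of parameter whose effect the paper's final set of experiments examines.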

