Improved distant supervision relation extraction based on edge-reasoning hybrid graph model
Authors: Shirong Shen, Shangfu Duan, Huan Gao, Guilin Qi
Journal of Web Semantics, published 2021-07-01
DOI: 10.1016/j.websem.2021.100656
URL: https://www.sciencedirect.com/science/article/pii/S1570826821000317
Distant supervision relation extraction (DSRE) trains a classifier on data labeled automatically by aligning triples in a knowledge base (KB) with large-scale corpora. Training data generated this way may contain many mislabeled instances, which harms classifier training. Some recent methods show that relevant background information in KBs, such as entity types (e.g., Organization and Book), can improve DSRE performance. However, these methods have three main problems. First, each is tailored to a single type of information, which helps only some instances and does not benefit all cases. Second, different kinds of background information are embedded independently, so no reasonable interaction between them is achieved. Third, previous methods do not consider the side effect of the noise that background information introduces. To address these issues, we leverage five types of background information, rather than the single type used in previous work, and propose a novel edge-reasoning hybrid graph (ER-HG) model to realize reasonable interaction between the different kinds of information. In addition, we employ an attention mechanism in the ER-HG model to alleviate the side effect of noise. The ER-HG model integrates all types of information efficiently and is highly robust to noise in that information. We conduct experiments on two widely used datasets. The experimental results demonstrate that our model significantly outperforms state-of-the-art methods on the held-out metric and in robustness tests.
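To make the labeling step concrete, the following is a minimal sketch of distant supervision, not the authors' pipeline. The toy KB triples, the example sentences, and the `distant_label` helper are all hypothetical, and entity matching is reduced to plain substring search, which is an assumption for illustration only.

```python
from collections import defaultdict

# Hypothetical toy knowledge base of (head, relation, tail) triples.
KB_TRIPLES = [
    ("Steve Jobs", "founder_of", "Apple"),
    ("Barack Obama", "born_in", "Hawaii"),
]

# Hypothetical toy corpus. Sentences 2 and 4 mention both entities
# but do not actually express the KB relation.
CORPUS = [
    "Steve Jobs co-founded Apple in 1976.",
    "Steve Jobs presented the iPhone at an Apple event.",
    "Barack Obama was born in Hawaii.",
    "Barack Obama visited Hawaii last winter.",
]


def distant_label(triples, sentences):
    """Assign the KB relation to every sentence that mentions both entities.

    Returns a dict mapping each triple to its "bag" of matched sentences.
    Matching is naive substring search (an assumption for this sketch).
    """
    bags = defaultdict(list)
    for head, relation, tail in triples:
        for sentence in sentences:
            if head in sentence and tail in sentence:
                bags[(head, relation, tail)].append(sentence)
    return dict(bags)


bags = distant_label(KB_TRIPLES, CORPUS)
for (head, relation, tail), matched in bags.items():
    print(f"{head} --{relation}--> {tail}")
    for sentence in matched:
        print("   ", sentence)
```

Every sentence in a bag receives the KB relation as its label, so "Barack Obama visited Hawaii last winter." is labeled `born_in` even though it does not express that relation. This is the kind of mislabeled instance the abstract refers to.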
About the journal:
The Journal of Web Semantics is an interdisciplinary journal based on research and applications of various subject areas that contribute to the development of a knowledge-intensive and intelligent service Web. These areas include knowledge technologies, ontology, agents, databases and the semantic grid; disciplines such as information retrieval, language technology, human-computer interaction and knowledge discovery are of major relevance as well. All aspects of Semantic Web development are covered. The publication of large-scale experiments and their analysis is also encouraged, to clearly illustrate scenarios and methods that introduce semantics into existing Web interfaces, contents and services. The journal emphasizes the publication of papers that combine theories, methods and experiments from different subject areas in order to deliver innovative semantic methods and applications.