{"title":"Towards Combined Matrix and Tensor Factorization for Universal Schema Relation Extraction","authors":"Sameer Singh, Tim Rocktäschel, S. Riedel","doi":"10.3115/v1/W15-1519","DOIUrl":null,"url":null,"abstract":"Matrix factorization of knowledge bases in universal schema has facilitated accurate distantlysupervised relation extraction. This factorization encodes dependencies between textual patterns and structured relations using lowdimensional vectors defined for each entity pair; although these factors are effective at combining evidence for an entity pair, they are inaccurate on rare pairs, or for relations that depend crucially on the entity types. On the other hand, tensor factorization is able to overcome these shortcomings when applied to link prediction by maintaining entity-wise factors. However these models have been unsuitable for universal schema. In this paper we first present an illustration on synthetic data that explains the unsuitability of tensor factorization to relation extraction with universal schemas. Since the benefits of tensor and matrix factorization are complementary, we then investigate two hybrid methods that combine the benefits of the two paradigms. We show that the combination can be fruitful: we handle ambiguously phrased relations, achieve gains in accuracy on real-world relations, and demonstrate that entity embeddings encode entity types.","PeriodicalId":299646,"journal":{"name":"VS@HLT-NAACL","volume":"5 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"18","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"VS@HLT-NAACL","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3115/v1/W15-1519","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 18
Abstract
Matrix factorization of knowledge bases in universal schema has facilitated accurate distantlysupervised relation extraction. This factorization encodes dependencies between textual patterns and structured relations using lowdimensional vectors defined for each entity pair; although these factors are effective at combining evidence for an entity pair, they are inaccurate on rare pairs, or for relations that depend crucially on the entity types. On the other hand, tensor factorization is able to overcome these shortcomings when applied to link prediction by maintaining entity-wise factors. However these models have been unsuitable for universal schema. In this paper we first present an illustration on synthetic data that explains the unsuitability of tensor factorization to relation extraction with universal schemas. Since the benefits of tensor and matrix factorization are complementary, we then investigate two hybrid methods that combine the benefits of the two paradigms. We show that the combination can be fruitful: we handle ambiguously phrased relations, achieve gains in accuracy on real-world relations, and demonstrate that entity embeddings encode entity types.