Contextual Fact Ranking and Its Applications in Table Synthesis and Compression

Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. Pub Date: 2019-07-25. DOI: 10.1145/3292500.3330980

Silu Huang, Jialu Liu, Flip Korn, Xuezhi Wang, You Wu, Dale Markowitz, Cong Yu


Citations: 7

Abstract

Modern search engines increasingly incorporate tabular content, which consists of a set of entities, each augmented with a small set of facts. The facts can be obtained from multiple sources: an entity's knowledge base entry, the infobox on its Wikipedia page, or its row within a WebTable. Crucially, the informativeness of a fact depends not only on the entity but also on the specific context (e.g., the query). To the best of our knowledge, this paper is the first to study the problem of contextual fact ranking: given some entities and a context (i.e., a succinct natural language description), identify the most informative facts for the entities collectively within the context. We propose to contextually rank the facts by exploiting deep learning techniques. In particular, we develop pointwise and pairwise ranking models, using textual and statistical information for the given entities and context derived from their sources. We enhance the models by incorporating entity type information from an IsA (hypernym) database. We demonstrate that our approaches achieve better performance than state-of-the-art baselines in terms of MAP, NDCG, and recall. We further conduct user studies for two specific applications of contextual fact ranking (table synthesis and table compression) and show that our models can identify more informative facts than the baselines.
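The abstract reports results in terms of MAP, NDCG, and recall without restating their definitions. The following is a minimal sketch of the standard textbook forms of these metrics, not the authors' evaluation code. The function names and the toy relevance labels are hypothetical, and the average-precision function assumes that every informative fact appears somewhere in the ranked list.

```python
import math
from typing import Sequence


def average_precision(relevance: Sequence[int]) -> float:
    """AP for one ranked list; relevance[i] is 1 if the fact at rank i+1 is informative.

    Assumption: all informative facts are present in the list, so the
    number of hits equals the total number of relevant items.
    """
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(lists: Sequence[Sequence[int]]) -> float:
    """MAP: the mean of AP over several ranked lists (e.g., one per context)."""
    return sum(average_precision(r) for r in lists) / len(lists) if lists else 0.0


def dcg_at_k(gains: Sequence[float], k: int) -> float:
    """Discounted cumulative gain with a log2 position discount."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))


def ndcg_at_k(gains: Sequence[float], k: int) -> float:
    """DCG normalized by the DCG of the ideal (descending-gain) ordering."""
    ideal = dcg_at_k(sorted(gains, reverse=True), k)
    return dcg_at_k(gains, k) / ideal if ideal > 0 else 0.0


def recall_at_k(relevance: Sequence[int], k: int) -> float:
    """Fraction of all informative facts that appear in the top k."""
    total = sum(relevance)
    return sum(relevance[:k]) / total if total else 0.0


# Toy example (hypothetical labels): four ranked facts, the 1st and 3rd informative.
ranked = [1, 0, 1, 0]
print(average_precision(ranked))
print(ndcg_at_k(ranked, k=4))
print(recall_at_k(ranked, k=2))
```

All three metrics reward rankings that place informative facts near the top. MAP and NDCG are sensitive to the exact positions, while recall at k only checks whether a fact made the cut, which matters when a table can display only a few columns.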

