表格单元格搜索问题回答

Huan Sun, Hao Ma, Xiaodong He, Wen-tau Yih, Yu Su, Xifeng Yan
{"title":"表格单元格搜索问题回答","authors":"Huan Sun, Hao Ma, Xiaodong He, Wen-tau Yih, Yu Su, Xifeng Yan","doi":"10.1145/2872427.2883080","DOIUrl":null,"url":null,"abstract":"Tables are pervasive on the Web. Informative web tables range across a large variety of topics, which can naturally serve as a significant resource to satisfy user information needs. Driven by such observations, in this paper, we investigate an important yet largely under-addressed problem: Given millions of tables, how to precisely retrieve table cells to answer a user question. This work proposes a novel table cell search framework to attack this problem. We first formulate the concept of a relational chain which connects two cells in a table and represents the semantic relation between them. With the help of search engine snippets, our framework generates a set of relational chains pointing to potentially correct answer cells. We further employ deep neural networks to conduct more fine-grained inference on which relational chains best match the input question and finally extract the corresponding answer cells. Based on millions of tables crawled from the Web, we evaluate our framework in the open-domain question answering (QA) setting, using both the well-known WebQuestions dataset and user queries mined from Bing search engine logs. On WebQuestions, our framework is comparable to state-of-the-art QA systems based on knowledge bases (KBs), while on Bing queries, it outperforms other systems with a 56.7% relative gain. Moreover, when combined with results from our framework, KB-based QA performance can obtain a relative improvement of 28.1% to 66.7%, demonstrating that web tables supply rich knowledge that might not exist or is difficult to be identified in existing KBs.","PeriodicalId":20455,"journal":{"name":"Proceedings of the 25th International Conference on World Wide Web","volume":"98 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2016-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"109","resultStr":"{\"title\":\"Table Cell Search for Question Answering\",\"authors\":\"Huan Sun, Hao Ma, Xiaodong He, Wen-tau Yih, Yu Su, Xifeng Yan\",\"doi\":\"10.1145/2872427.2883080\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Tables are pervasive on the Web. Informative web tables range across a large variety of topics, which can naturally serve as a significant resource to satisfy user information needs. Driven by such observations, in this paper, we investigate an important yet largely under-addressed problem: Given millions of tables, how to precisely retrieve table cells to answer a user question. This work proposes a novel table cell search framework to attack this problem. We first formulate the concept of a relational chain which connects two cells in a table and represents the semantic relation between them. With the help of search engine snippets, our framework generates a set of relational chains pointing to potentially correct answer cells. We further employ deep neural networks to conduct more fine-grained inference on which relational chains best match the input question and finally extract the corresponding answer cells. Based on millions of tables crawled from the Web, we evaluate our framework in the open-domain question answering (QA) setting, using both the well-known WebQuestions dataset and user queries mined from Bing search engine logs. On WebQuestions, our framework is comparable to state-of-the-art QA systems based on knowledge bases (KBs), while on Bing queries, it outperforms other systems with a 56.7% relative gain. Moreover, when combined with results from our framework, KB-based QA performance can obtain a relative improvement of 28.1% to 66.7%, demonstrating that web tables supply rich knowledge that might not exist or is difficult to be identified in existing KBs.\",\"PeriodicalId\":20455,\"journal\":{\"name\":\"Proceedings of the 25th International Conference on World Wide Web\",\"volume\":\"98 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2016-04-11\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"109\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 25th International Conference on World Wide Web\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2872427.2883080\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 25th International Conference on World Wide Web","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2872427.2883080","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 109

摘要

表格在网络上无处不在。信息性web表涵盖了各种各样的主题,自然可以作为满足用户信息需求的重要资源。在这种观察的驱使下,在本文中,我们研究了一个重要但在很大程度上没有得到解决的问题:给定数百万个表,如何精确地检索表单元格来回答用户的问题。这项工作提出了一个新的表单元格搜索框架来解决这个问题。我们首先形成关系链的概念,它连接表中的两个单元格,并表示它们之间的语义关系。在搜索引擎片段的帮助下,我们的框架生成一组指向可能正确答案单元格的关系链。我们进一步使用深度神经网络对哪个关系链最适合输入问题进行更细粒度的推理,并最终提取相应的答案单元。基于从Web上抓取的数百万个表,我们在开放域问答(QA)设置中评估了我们的框架,使用了众所周知的WebQuestions数据集和从Bing搜索引擎日志中挖掘的用户查询。在WebQuestions上,我们的框架可以与最先进的基于知识库(KBs)的QA系统相媲美,而在Bing查询上,它以56.7%的相对增益优于其他系统。此外,当与我们的框架的结果相结合时,基于知识库的QA性能可以获得28.1%到66.7%的相对改进,这表明web表提供了丰富的知识,这些知识可能在现有知识库中不存在或难以识别。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Table Cell Search for Question Answering
Tables are pervasive on the Web. Informative web tables range across a large variety of topics, which can naturally serve as a significant resource to satisfy user information needs. Driven by such observations, in this paper, we investigate an important yet largely under-addressed problem: Given millions of tables, how to precisely retrieve table cells to answer a user question. This work proposes a novel table cell search framework to attack this problem. We first formulate the concept of a relational chain which connects two cells in a table and represents the semantic relation between them. With the help of search engine snippets, our framework generates a set of relational chains pointing to potentially correct answer cells. We further employ deep neural networks to conduct more fine-grained inference on which relational chains best match the input question and finally extract the corresponding answer cells. Based on millions of tables crawled from the Web, we evaluate our framework in the open-domain question answering (QA) setting, using both the well-known WebQuestions dataset and user queries mined from Bing search engine logs. On WebQuestions, our framework is comparable to state-of-the-art QA systems based on knowledge bases (KBs), while on Bing queries, it outperforms other systems with a 56.7% relative gain. Moreover, when combined with results from our framework, KB-based QA performance can obtain a relative improvement of 28.1% to 66.7%, demonstrating that web tables supply rich knowledge that might not exist or is difficult to be identified in existing KBs.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信