Querying Web data - the WebQA approach

Sunny Lam, M. Tamer Özsu
Published in: Proceedings of the Third International Conference on Web Information Systems Engineering (WISE 2002)
Publication date: 2002-12-12
DOI: 10.1109/WISE.2002.1181651 (https://doi.org/10.1109/WISE.2002.1181651)
Citations: 7

Abstract

The common paradigm for searching and retrieving information on the Web is keyword-based search using one or more search engines, followed by browsing through the large number of returned URLs. This is significantly weaker than the declarative querying supported by DBMSs. The lack of a schema and the high volatility of the Web make "database-like" querying of Web data difficult. We report on our work in building a system, called WebQA, that provides a declarative, query-based approach to Web data retrieval. It uses question-answering technology to extract information from Web sites retrieved by search engines. The approach first uses meta-search techniques in an open environment to gather candidate responses from search engines and other on-line databases, then uses information extraction techniques to find the answer to a specific question among these candidates. A prototype system has been developed to test this approach. Testing includes an evaluation of its performance as a question-answering system using the well-known TREC-9 evaluation. Its accuracy on TREC-9 data for simple questions is high and its retrieval performance is good. The system employs an open architecture that allows for ongoing improvements.
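
The two-stage approach described in the abstract (meta-search to gather candidates, then extraction of an answer from them) can be illustrated with a minimal Python sketch. This is not WebQA's implementation: the function names `gather_candidates`, `extract_answer`, and `answer`, the stub engines, and the "most frequently mentioned year" heuristic are all assumptions made for illustration, and the snippets are canned strings rather than real search results.

```python
import re
from collections import Counter
from typing import Callable, Iterable, List, Optional

# Assumption: a "search engine" is any callable mapping a query to text snippets.
SearchEngine = Callable[[str], List[str]]

# Toy pattern for four-digit years (1000-2099); a stand-in for real extraction rules.
YEAR = re.compile(r"\b(1[0-9]{3}|20[0-9]{2})\b")


def gather_candidates(question: str, engines: Iterable[SearchEngine]) -> List[str]:
    """Meta-search step: send the question to every engine and pool the snippets."""
    candidates: List[str] = []
    for engine in engines:
        candidates.extend(engine(question))
    return candidates


def extract_answer(candidates: List[str]) -> Optional[str]:
    """Extraction step: return the year mentioned in the most snippets, if any."""
    votes: Counter = Counter()
    for snippet in candidates:
        # Count each year at most once per snippet.
        votes.update(set(YEAR.findall(snippet)))
    if not votes:
        return None
    return votes.most_common(1)[0][0]


def answer(question: str, engines: Iterable[SearchEngine]) -> Optional[str]:
    """Run both stages: gather candidates, then extract one answer."""
    return extract_answer(gather_candidates(question, engines))


# Hypothetical stub engines returning canned snippets (no network access).
def engine_a(query: str) -> List[str]:
    return ["Tim Berners-Lee proposed the World Wide Web in 1989 while at CERN."]


def engine_b(query: str) -> List[str]:
    return [
        "The Web was proposed in 1989; the first site went online in 1991.",
        "In 1989 Berners-Lee wrote his proposal.",
    ]


if __name__ == "__main__":
    print(answer("When was the World Wide Web proposed?", [engine_a, engine_b]))
```

Keeping the engines behind a plain callable interface mirrors the open architecture the abstract mentions: a new source can be added to the list without changing the extraction step.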