An efficient Internet crawling and filtering system for the nationwide tendering information retrieval

Toshio Matsuda, Kazushige Nakamura, Norihiko Sakamoto
Proceedings IEEE/WIC International Conference on Web Intelligence (WI 2003), published 2003-10-13. DOI: 10.1109/WI.2003.1241304 (https://doi.org/10.1109/WI.2003.1241304)
Citations: 1

Abstract

With the growth of the Internet, central and local governments have begun to publish prospective orders for public works, tendering announcements, and contracting information on their Web sites. However, it is time-consuming and painful for bidders such as constructors and manufacturers to periodically search this information for items that match their needs. General-purpose search engines such as Google and Yahoo! are not effective for retrieving this information quickly enough because of their crawling interval and coverage. We therefore developed a system that automates gathering such information, filtering it according to users' needs, and delivering it as a tendering and contracting information database. We describe the concept of the system and the key techniques used to realize it: (1) efficiently retrieving only relevant Web pages, and (2) filtering to match users' needs.
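The two techniques named in the abstract can be illustrated with a minimal sketch. The abstract does not say how the authors decide relevance or match users' needs, so everything below is an assumption made for illustration: the in-memory `PAGES` table stands in for government Web sites, `RELEVANT_TERMS` is a hypothetical keyword list, and `crawl` and `filter_for_user` are hypothetical helper names, not the system's actual components.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
# A real crawler would fetch pages over HTTP; this table is an assumption.
PAGES = {
    "gov/index": ("City office top page", ["gov/tenders", "gov/tourism"]),
    "gov/tenders": ("Announcement of tendering for public works",
                    ["gov/tenders/road"]),
    "gov/tenders/road": ("Tendering: road construction, contract information", []),
    "gov/tourism": ("Sightseeing guide and festivals", ["gov/tourism/map"]),
    "gov/tourism/map": ("Tourist map", []),
}

# Assumed keyword list; the paper's real relevance criterion is not given.
RELEVANT_TERMS = ("tender", "contract", "public works")


def is_relevant(text):
    """Return True if the page text mentions any assumed tendering term."""
    lowered = text.lower()
    return any(term in lowered for term in RELEVANT_TERMS)


def crawl(seed, pages):
    """Breadth-first crawl that follows links only from the seed page
    and from pages judged relevant, so irrelevant branches are pruned."""
    seen = {seed}
    queue = deque([seed])
    hits = []
    while queue:
        url = queue.popleft()
        text, links = pages[url]
        relevant = is_relevant(text)
        if relevant:
            hits.append((url, text))
        if relevant or url == seed:
            for link in links:
                if link in pages and link not in seen:
                    seen.add(link)
                    queue.append(link)
    return hits


def filter_for_user(hits, user_keywords):
    """Keep the URLs of crawled pages that mention any of the user's keywords."""
    return [
        url
        for url, text in hits
        if any(keyword.lower() in text.lower() for keyword in user_keywords)
    ]


hits = crawl("gov/index", PAGES)
road_pages = filter_for_user(hits, ["road"])
```

In this sketch the tourism branch is fetched once, judged irrelevant, and never expanded, which is one simple way to limit crawling to relevant pages; the per-user keyword match then narrows the collected pages to a bidder's interests.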