SharpSpider: spidering the Web through Web services
Authors: K. Moody, Marco A. Palomino
DOI: https://doi.org/10.1109/LAWEB.2003.1250304
Published: 2003-11-10
Web search engines have become an indispensable utility for Internet users. In the near future, however, Web search engines will not only be expected to provide quality search results, but also to enable applications to search and exploit their index repositories directly. We present here SharpSpider, a distributed C# spider designed to address the issues of scalability, decentralisation and continuity of a Web crawl. Fundamental to the design of SharpSpider is the publication of an API for use by other services on the network. Such an API grants access to a constantly refreshed index built after successive crawls of the Web.
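The abstract does not describe the API itself, so the following is only a minimal sketch of the general idea: an index that is rebuilt after each crawl and exposed to other programs through a query call. It is written in Python rather than C#, and every name in it (`CrawlIndex`, `refresh`, `search`, `generation`) is hypothetical and not taken from SharpSpider.

```python
from collections import defaultdict


class CrawlIndex:
    """Hypothetical in-memory index that is replaced after each crawl."""

    def __init__(self):
        self._postings = {}   # term -> set of URLs containing that term
        self.generation = 0   # number of crawls folded into the index so far

    def refresh(self, pages):
        """Rebuild the index from one crawl's output, given as {url: page_text}."""
        postings = defaultdict(set)
        for url, text in pages.items():
            for term in text.lower().split():
                postings[term].add(url)
        # Swap in the new index with a single assignment so that callers
        # never observe a half-built index.
        self._postings = dict(postings)
        self.generation += 1

    def search(self, query):
        """Return the sorted URLs whose text contains every term in the query."""
        terms = query.lower().split()
        if not terms:
            return []
        hits = [self._postings.get(term, set()) for term in terms]
        return sorted(set.intersection(*hits))
```

A client application would call `search` without knowing when the last crawl finished. In a networked deployment the same two operations would presumably sit behind a Web service endpoint, which this sketch leaves out.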