A data-mining approach for optimizing performance of an incremental crawler
Authors: Hadrien Bullot, S. Gupta, M. Mohania
Published in: Proceedings IEEE/WIC International Conference on Web Intelligence (WI 2003), 2003-10-13
DOI: 10.1109/WI.2003.1241279 (https://doi.org/10.1109/WI.2003.1241279)
Crawlers visit the Web to keep a local repository of Web pages up to date. We introduce another perspective on building an effective incremental crawler. Building on previous work in this field, we study how data mining can improve a crawler's performance. Information collected from users can tell the crawler which pages are popular, so that it can revisit them as soon as possible.
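
The idea of revisiting popular pages first can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not say how user information is collected or mined, so the sketch simply assumes that popularity can be approximated by request counts in an access log. The function name `revisit_order`, the log format, and the example URLs are all hypothetical.

```python
from collections import Counter
from typing import Iterable, List


def revisit_order(access_log: Iterable[str], repository: Iterable[str]) -> List[str]:
    """Order repository pages for revisiting, most requested first.

    Assumption: each log entry is the URL of one user request.
    Pages that never appear in the log count as zero hits and go last.
    Ties are broken alphabetically so the result is deterministic.
    """
    pages = set(repository)
    # Count only requests for pages the crawler already stores locally.
    hits = Counter(url for url in access_log if url in pages)
    return sorted(pages, key=lambda url: (-hits[url], url))


# Hypothetical usage with made-up URLs.
log = ["/news", "/home", "/news", "/about", "/news", "/home"]
repo = ["/home", "/about", "/news", "/archive"]
print(revisit_order(log, repo))
```

A real incremental crawler would probably weigh popularity against how often each page changes. This sketch ranks by popularity alone.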