Peng Nie, Yu-Chen Zheng, Ruixuan Wang, Chu-Qiao Chen, Jianxiong Dong, Jia-Hao Liu
2020 2nd International Conference on Information Technology and Computer Application (ITCA), published 2020-12-01. DOI: 10.1109/ITCA52113.2020.00159 (https://doi.org/10.1109/ITCA52113.2020.00159)
Design and Implementation of Information Subscription System Based on Intelligent Crawler
As data volumes keep growing in the information age, it is becoming increasingly difficult for people to find the information they care about. Manually collecting information from such massive data is inconvenient and inefficient. To address this problem, we design a web information subscription system based on an intelligent crawler. The first section of this paper introduces the design of the system, including the two confirmations used to determine the specific monitoring area, the original XPath-based information positioning and block method, and task management based on APScheduler, among other components. The second section introduces the implementation of the system, including how the system is operated. The third section shows the running interface and results of the system. Finally, experiments show that the system can help users quickly obtain useful web information.
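
The abstract does not give the details of the XPath-based positioning step, so the following is only a minimal sketch of the general idea, not the authors' method: locate a monitored block of a page with an XPath expression, fingerprint its content, and report the items that are new since the previous snapshot. The page snippets, the `MONITORED` expression, and all function names are hypothetical. It uses the limited XPath subset in Python's standard-library `xml.etree.ElementTree` rather than a full XPath engine, and it leaves out periodic scheduling, which the paper handles with APScheduler.

```python
import hashlib
import xml.etree.ElementTree as ET
from typing import List

# Hypothetical page snapshots; a real crawler would fetch these over HTTP.
OLD_PAGE = """<html><body>
<div id="news"><ul><li>Item A</li><li>Item B</li></ul></div>
<div id="ads"><p>Banner 1</p></div>
</body></html>"""

NEW_PAGE = """<html><body>
<div id="news"><ul><li>Item A</li><li>Item B</li><li>Item C</li></ul></div>
<div id="ads"><p>Banner 2</p></div>
</body></html>"""

# Hypothetical XPath for the block the user chose to monitor.
MONITORED = ".//div[@id='news']/ul/li"


def extract_block(page: str, xpath: str) -> List[str]:
    """Return the stripped text of every node matched by `xpath`."""
    root = ET.fromstring(page)
    return [(node.text or "").strip() for node in root.findall(xpath)]


def fingerprint(items: List[str]) -> str:
    """Hash the block content so two snapshots can be compared cheaply."""
    return hashlib.sha256("\n".join(items).encode("utf-8")).hexdigest()


def new_items(old_page: str, new_page: str, xpath: str) -> List[str]:
    """Return items in the monitored block of new_page that old_page lacks."""
    old = extract_block(old_page, xpath)
    new = extract_block(new_page, xpath)
    if fingerprint(old) == fingerprint(new):
        return []  # the monitored block is unchanged
    seen = set(old)
    return [item for item in new if item not in seen]


# Only the monitored block is compared; the changed banner is ignored.
print(new_items(OLD_PAGE, NEW_PAGE, MONITORED))
```

Restricting the comparison to the XPath-selected block is what keeps unrelated page changes, such as the banner in this example, from triggering a notification. In a running system a scheduler would call `new_items` at a fixed interval for each subscription.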