Deep Reinforcement Learning for Web Crawling

2021 Seventh Indian Control Conference (ICC) Pub Date : 2021-12-20 DOI:10.1109/ICC54714.2021.9703160

Konstantin Avrachenkov, V. Borkar, K. Patil


Citations: 2

Abstract

Deep Reinforcement Learning for Web Crawling

A search engine uses a web crawler to fetch pages from the World Wide Web (WWW) and aims to keep its local cache as fresh as possible. Unfortunately, the rates at which different pages change are highly nonuniform and, in many real-life scenarios, unknown. In addition, the finite available bandwidth and possible server restrictions on crawling frequency make it very difficult for the crawler to find the optimal scheduling policy that maximises the freshness of the local cache. We model this problem in a restless multi-armed bandit framework, where each arm represents a web page or an aggregate of statistically identical web pages. The objective is to find the scheduling policy that gives the exact indices of the pages to be crawled at each time instant. We provide an online learning scheme based on a deep reinforcement learning (DRL) framework, which learns the unknown page-change dynamics on the fly along with the optimal crawling policy. Finally, we run numerical simulations to compare our approach with state-of-the-art algorithms such as static optimisation and Thompson sampling, and we observe better performance for DRL.
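To make the restless-bandit formulation in the abstract concrete, here is a minimal Python sketch of a crawl-scheduling simulation. It is not the authors' DRL scheme. It assumes that each page changes independently in every time slot with a fixed probability, that these probabilities are known to the scheduler (the paper treats them as unknown), and that the crawler may refresh at most `budget` pages per slot. The names `stale_probability`, `simulate`, `round_robin` and `greedy_stale` are hypothetical and chosen only for this illustration.

```python
import random


def stale_probability(change_prob: float, age: int) -> float:
    """Probability that a page changed at least once during `age` slots."""
    return 1.0 - (1.0 - change_prob) ** age


def simulate(change_probs, budget, horizon, policy, seed=0):
    """Return the average fraction of fresh cached pages over `horizon` slots."""
    rng = random.Random(seed)
    n = len(change_probs)
    fresh = [True] * n   # does the cached copy still match the live page?
    age = [0] * n        # slots elapsed since each page was last crawled
    total_fresh = 0
    for t in range(horizon):
        # Restless dynamics: every arm (page) evolves, crawled or not.
        for i in range(n):
            age[i] += 1
            if rng.random() < change_probs[i]:
                fresh[i] = False
        # The policy observes ages only, not the hidden fresh/stale state.
        for i in policy(t, age, change_probs, budget):
            fresh[i] = True
            age[i] = 0
        total_fresh += sum(fresh)
    return total_fresh / (n * horizon)


def round_robin(t, age, change_probs, budget):
    """Baseline: cycle through the pages in a fixed order."""
    n = len(age)
    return [(t * budget + k) % n for k in range(budget)]


def greedy_stale(t, age, change_probs, budget):
    """Heuristic: crawl the pages most likely to be stale right now."""
    ranked = sorted(
        range(len(age)),
        key=lambda i: stale_probability(change_probs[i], age[i]),
        reverse=True,
    )
    return ranked[:budget]


if __name__ == "__main__":
    probs = [0.05, 0.1, 0.3, 0.6]  # assumed per-slot change probabilities
    for policy in (round_robin, greedy_stale):
        score = simulate(probs, budget=1, horizon=10_000, policy=policy)
        print(f"{policy.__name__}: average freshness = {score:.3f}")
```

Both policies are simple reference points rather than optimal schedulers. The greedy heuristic can waste bandwidth on pages that change faster than they can usefully be refreshed, which is one reason the bandwidth-constrained problem calls for an index policy or a learned policy.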


Source journal

2021 Seventh Indian Control Conference (ICC)

Self-citation rate

0.00%

Publication count