利用随机非线性分数形装箱解决分布式网络爬行问题

A. Yazidi, B. Oommen, Ole-Christoffer Granmo, M. G. Olsen
{"title":"利用随机非线性分数形装箱解决分布式网络爬行问题","authors":"A. Yazidi, B. Oommen, Ole-Christoffer Granmo, M. G. Olsen","doi":"10.1109/CSE.2014.40","DOIUrl":null,"url":null,"abstract":"This paper deals with the extremely pertinent problem of web crawling, which is far from trivial considering the magnitude and all-pervasive nature of the World-Wide Web. While numerous AI tools can be used to deal with this task, in this paper we map the problem onto the combinatorially-hard stochastic non-linear fractional knapsack problem, which, in turn, is then solved using Learning Automata (LA). Such LA-based solutions have been recently shown to outperform previous state-of-the-art approaches to resource allocation in Web monitoring. However, the ever growing deployment of distributed systems raises the need for solutions that cope with a distributed setting. In this paper, we present a novel scheme for solving the non-linear fractional bin packing problem. Furthermore, we demonstrate that our scheme has applications to Web crawling, i.e., Distributed resource allocation, and in particular, to distributed Web monitoring. Comprehensive experimental results demonstrate the superiority of our scheme when compared to other classical approaches.","PeriodicalId":258990,"journal":{"name":"2014 IEEE 17th International Conference on Computational Science and Engineering","volume":"87 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"On Utilizing Stochastic Non-linear Fractional Bin Packing to Resolve Distributed Web Crawling\",\"authors\":\"A. Yazidi, B. Oommen, Ole-Christoffer Granmo, M. G. Olsen\",\"doi\":\"10.1109/CSE.2014.40\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This paper deals with the extremely pertinent problem of web crawling, which is far from trivial considering the magnitude and all-pervasive nature of the World-Wide Web. While numerous AI tools can be used to deal with this task, in this paper we map the problem onto the combinatorially-hard stochastic non-linear fractional knapsack problem, which, in turn, is then solved using Learning Automata (LA). Such LA-based solutions have been recently shown to outperform previous state-of-the-art approaches to resource allocation in Web monitoring. However, the ever growing deployment of distributed systems raises the need for solutions that cope with a distributed setting. In this paper, we present a novel scheme for solving the non-linear fractional bin packing problem. Furthermore, we demonstrate that our scheme has applications to Web crawling, i.e., Distributed resource allocation, and in particular, to distributed Web monitoring. Comprehensive experimental results demonstrate the superiority of our scheme when compared to other classical approaches.\",\"PeriodicalId\":258990,\"journal\":{\"name\":\"2014 IEEE 17th International Conference on Computational Science and Engineering\",\"volume\":\"87 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2014-12-19\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2014 IEEE 17th International Conference on Computational Science and Engineering\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CSE.2014.40\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2014 IEEE 17th International Conference on Computational Science and Engineering","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CSE.2014.40","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

本文研究的是一个非常相关的网络爬行问题,考虑到万维网的规模和无所不在的性质,这个问题远非微不足道。虽然许多人工智能工具可以用来处理这个任务,但在本文中,我们将这个问题映射到组合难随机非线性分数背包问题上,然后使用学习自动机(LA)来解决这个问题。这种基于la的解决方案最近被证明在Web监视中优于以前最先进的资源分配方法。然而,不断增长的分布式系统部署提出了应对分布式设置的解决方案的需求。本文提出了一种求解非线性分式装箱问题的新方案。此外,我们还证明了我们的方案适用于Web爬行,即分布式资源分配,特别是分布式Web监控。综合实验结果表明,与其他经典方法相比,该方案具有优越性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
On Utilizing Stochastic Non-linear Fractional Bin Packing to Resolve Distributed Web Crawling
This paper deals with the extremely pertinent problem of web crawling, which is far from trivial considering the magnitude and all-pervasive nature of the World-Wide Web. While numerous AI tools can be used to deal with this task, in this paper we map the problem onto the combinatorially-hard stochastic non-linear fractional knapsack problem, which, in turn, is then solved using Learning Automata (LA). Such LA-based solutions have been recently shown to outperform previous state-of-the-art approaches to resource allocation in Web monitoring. However, the ever growing deployment of distributed systems raises the need for solutions that cope with a distributed setting. In this paper, we present a novel scheme for solving the non-linear fractional bin packing problem. Furthermore, we demonstrate that our scheme has applications to Web crawling, i.e., Distributed resource allocation, and in particular, to distributed Web monitoring. Comprehensive experimental results demonstrate the superiority of our scheme when compared to other classical approaches.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信