A Resource-Efficient Method for Crawling Swarm Information in Multiple BitTorrent Networks
Masahiro Yoshida, A. Nakao
2011 Tenth International Symposium on Autonomous Decentralized Systems
Published: 2011-03-23
DOI: 10.1109/ISADS.2011.72 (https://doi.org/10.1109/ISADS.2011.72)
Citations: 16
Abstract
BitTorrent is one of the most popular P2P file-sharing applications in the world. Each BitTorrent network is called a swarm, and millions of peers may join multiple swarms. Because swarms are large and complex, measuring all the swarms in the world requires many resources (PC servers, Internet connectivity, etc.). For this reason, existing work has been forced to measure only a subset of swarms and therefore captures only part of the picture. In this paper, we propose a resource-efficient method for crawling multiple BitTorrent swarms with only limited resources, such as a single PC server. In the proposed method, our crawler avoids collecting redundant swarm information, without straining WAN access links or expending much processing capacity. We also use a number of techniques to efficiently crawl all the participating peers of multiple swarms. We crawl over 4.3 million unique .torrent files (small files that store the metadata used in BitTorrent) and 48,000 tracker addresses. We can crawl 4.3 million swarms within an hour. We obtain 24 swarm snapshots and 10 million unique peers in a day.
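
To put the reported figures in perspective, the short calculation below derives the sustained crawl rate implied by covering 4.3 million swarms within an hour, and the snapshot interval implied by 24 snapshots a day. It is only a back-of-the-envelope sketch: the variable names are illustrative, and the swarms-per-tracker ratio is a crude average that assumes an even spread across trackers, which the abstract does not claim.

```python
# Figures quoted in the abstract.
SWARMS = 4_300_000        # swarms crawled within one hour
TRACKERS = 48_000         # tracker addresses collected
SNAPSHOTS_PER_DAY = 24    # swarm snapshots obtained in a day

SECONDS_PER_HOUR = 3600
HOURS_PER_DAY = 24


def required_rate(swarms: int, window_seconds: int) -> float:
    """Minimum average number of swarms to process per second."""
    return swarms / window_seconds


# "Within an hour" is an upper bound on time, so this is a lower bound on rate.
rate = required_rate(SWARMS, SECONDS_PER_HOUR)

# 24 snapshots a day corresponds to one full crawl per hour.
snapshot_interval_hours = HOURS_PER_DAY / SNAPSHOTS_PER_DAY

# Crude average only (assumption): real swarms are not spread evenly over
# trackers, and a single .torrent file may list several trackers.
swarms_per_tracker = SWARMS / TRACKERS

print(f"required rate: at least {rate:.0f} swarms/s")
print(f"snapshot interval: {snapshot_interval_hours:.0f} h")
print(f"average swarms per tracker: {swarms_per_tracker:.1f}")
```

A rate on the order of a thousand swarms per second from a single PC server suggests why avoiding redundant requests matters for the proposed method.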