Crawlers for social networks & structural analysis of Twitter
Atul Saroop, A. Karnik
2011 IEEE 5th International Conference on Internet Multimedia Systems Architecture and Application, 2011-12-01
DOI: 10.1109/IMSAA.2011.6156368 (https://doi.org/10.1109/IMSAA.2011.6156368)
Online social networks are growing rapidly, both through new links between existing nodes and through the addition of new nodes. Because these networks evolve continuously, it is important to crawl the overall network regularly and to crawl specific subnetworks when the need arises. Precise information about social networks is important for devising strategies that improve the spread of targeted information among users, for fine-tuning the messaging in marketing campaigns, and for measuring the effectiveness of such campaigns. With the objective of gathering precise, up-to-date information, we explore designs of fast crawlers for online social networks. Our experiments, carried out on data downloaded from Twitter, indicate that two node discovery strategies, random walk with backtrack and random search, show promise as fast network crawlers. We implement the random search crawler to collect large amounts of data from Twitter on network structure, user profiles, and individual Tweets, and we present a summary of the data collected. We also attempt to design generative models for Twitter-like networks that can be used in future simulations, removing the need to download large amounts of network data from online social networks.
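The abstract names the two node discovery strategies without defining them, so the following is only an illustrative sketch under our own assumptions, not the authors' implementation. It assumes that "random search" expands a uniformly random node among those discovered but not yet queried, and that "random walk with backtrack" moves to a random unqueried neighbour and steps back along its path when none remains. The graph is a small synthetic follower graph, and `toy_graph`, `budget` (the number of simulated neighbour-list requests), and both function names are hypothetical.

```python
import random


def toy_graph(n):
    """Synthetic directed graph (assumption): node i links to i-1, i+1 and i+5 mod n."""
    return {i: [(i - 1) % n, (i + 1) % n, (i + 5) % n] for i in range(n)}


def random_search(graph, start, budget, rng):
    """Expand a uniformly random discovered-but-unexpanded node at each step."""
    discovered = {start}
    frontier = [start]
    for _ in range(budget):
        if not frontier:
            break
        # Swap a random frontier entry to the end so it can be popped in O(1).
        i = rng.randrange(len(frontier))
        frontier[i], frontier[-1] = frontier[-1], frontier[i]
        node = frontier.pop()
        for nbr in graph[node]:  # one simulated neighbour-list request
            if nbr not in discovered:
                discovered.add(nbr)
                frontier.append(nbr)
    return discovered


def random_walk_with_backtrack(graph, start, budget, rng):
    """Walk to a random unexpanded neighbour; step back along the path when stuck."""
    discovered = {start}
    expanded = set()
    path = [start]
    while path and len(expanded) < budget:
        current = path[-1]
        if current not in expanded:
            expanded.add(current)  # one simulated neighbour-list request
            discovered.update(graph[current])
        candidates = [n for n in graph[current] if n not in expanded]
        if candidates:
            path.append(rng.choice(candidates))
        else:
            path.pop()  # dead end: backtrack one step
    return discovered


if __name__ == "__main__":
    g = toy_graph(200)
    for budget in (10, 50, 200):
        rs = random_search(g, 0, budget, random.Random(0))
        rw = random_walk_with_backtrack(g, 0, budget, random.Random(0))
        print(budget, len(rs), len(rw))
```

Comparing the number of discovered nodes at equal request budgets is one simple way to judge how fast a crawler is. The counts on this toy graph say nothing about the results reported for Twitter.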