Collecting Non-Geotagged Local Tweets via Bandit Algorithms

Proceedings of the 2017 ACM on Conference on Information and Knowledge Management; Pub Date: 2017-11-06; DOI: 10.1145/3132847.3133046

Saki Ueda, Yuto Yamaguchi, H. Kitagawa

Citations: 1

Abstract

How can we collect as many non-geotagged tweets as possible from users in a specific location within a limited time span? How can we find such users if we do not have much information about the specified location? Although various methods exist for estimating users' locations, they are not directly applicable to this problem because they require collecting a large number of random tweets and then filtering them to obtain a small number of tweets from such users. In this paper, we propose a framework that incrementally finds such users and continuously collects tweets from them. Our framework is based on a bandit algorithm that adjusts the trade-off between exploration and exploitation; in other words, it simultaneously finds new users in the specified location and collects tweets from already-found users. The experimental results show that the bandit algorithm works well on this problem and outperforms carefully designed baselines.
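The exploration and exploitation trade-off mentioned in the abstract can be illustrated with a small simulation. The abstract does not say which bandit algorithm the authors use, so the sketch below substitutes plain epsilon-greedy as a stand-in, and it is not the paper's method. The function name `epsilon_greedy_collect`, the `local_rates` list, and the 0/1 reward are all hypothetical, chosen only for illustration: each "arm" is a candidate user, and a reward of 1 means that polling the user returned a tweet from the target location.

```python
import random


def epsilon_greedy_collect(local_rates, rounds, epsilon=0.1, seed=0):
    """Simulate choosing which candidate user to poll in each round.

    local_rates[i] is a hypothetical probability that polling user i
    yields a tweet from the target location. The collector does not
    know these values and has to estimate them from observed rewards.
    Returns (pull counts per user, total reward).
    """
    rng = random.Random(seed)
    n = len(local_rates)
    pulls = [0] * n      # how many times each user was polled
    means = [0.0] * n    # running estimate of each user's reward rate
    total = 0

    for _ in range(rounds):
        if rng.random() < epsilon:
            # Explore: poll a user chosen uniformly at random.
            arm = rng.randrange(n)
        else:
            # Exploit: poll the user with the best estimate so far
            # (ties go to the lowest index).
            arm = max(range(n), key=lambda i: means[i])

        # Simulated feedback: 1 if the polled tweet was local, else 0.
        reward = 1 if rng.random() < local_rates[arm] else 0

        pulls[arm] += 1
        # Incremental update of the running mean for this user.
        means[arm] += (reward - means[arm]) / pulls[arm]
        total += reward

    return pulls, total


if __name__ == "__main__":
    pulls, total = epsilon_greedy_collect([0.05, 0.10, 0.80], rounds=2000)
    print("polls per user:", pulls)
    print("local tweets collected:", total)
```

With `epsilon=0.0` the collector never explores and keeps polling the first user, which shows why some exploration is needed to discover users who are more likely to be in the target location.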
