On the Accuracy of Hyper-local Geotagging of Social Media Content

Proceedings of the Eighth ACM International Conference on Web Search and Data Mining
Pub Date: 2014-09-04
DOI: 10.1145/2684822.2685296

David Flatow, Mor Naaman, K. Xie, Yana Volkovich, Y. Kanza

Citations: 67

Abstract

Social media users share billions of items per year, only a small fraction of which is geotagged. We present a data-driven approach for identifying non-geotagged content items that can be associated with a hyper-local geographic area by modeling the location distributions of n-grams that appear in the text. We explore the trade-off between the accuracy and the coverage of this method. Further, we explore differences across content received from multiple platforms and devices. We show, for example, that content shared via different sources and applications produces significantly different geographic distributions, and that it is preferable to model and predict location for items according to their source. Our findings show the potential and the bounds of a data-driven approach to assigning location data to short social media texts, and offer implications for all applications that use data-driven approaches to locate content.
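The idea of modeling n-gram location distributions can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 0.5 km radius, the minimum count, the share threshold, and the use of a coordinate-wise median as the center are all assumptions chosen only to make the accuracy/coverage trade-off concrete.

```python
import math
import statistics
from collections import defaultdict


def ngrams(text, n_max=2):
    """Yield lowercase word n-grams of length 1..n_max (whitespace tokenization)."""
    tokens = text.lower().split()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def fit_local_ngrams(items, radius_km=0.5, min_count=3, min_share=0.8):
    """Hypothetical model: keep n-grams whose geotagged points concentrate
    within radius_km of their coordinate-wise median.

    items: iterable of (text, (lat, lon)) pairs from geotagged content.
    Returns {ngram: (center, share)}.
    """
    points = defaultdict(list)
    for text, loc in items:
        for g in set(ngrams(text)):  # count each n-gram once per item
            points[g].append(loc)

    model = {}
    for g, locs in points.items():
        if len(locs) < min_count:
            continue  # too few observations to trust
        center = (statistics.median(p[0] for p in locs),
                  statistics.median(p[1] for p in locs))
        share = sum(haversine_km(center, p) <= radius_km for p in locs) / len(locs)
        if share >= min_share:
            model[g] = (center, share)
    return model


def predict(model, text):
    """Return the center of the most concentrated matching n-gram,
    or None when no n-gram in the text is in the model (no coverage)."""
    matches = [model[g] for g in ngrams(text) if g in model]
    if not matches:
        return None
    return max(matches, key=lambda m: m[1])[0]
```

In this sketch, raising `min_share` or shrinking `radius_km` keeps only tightly concentrated n-grams, which should improve accuracy while leaving more items without a prediction. Fitting a separate model per content source, as the abstract recommends, would amount to calling `fit_local_ngrams` once per source.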
