A Crawling Method with No Parameters for Geo-social Data based on Road Maps

Sou Ijima, Masaharu Hirota, Shohei Yokoyama
Published in: Proceedings of the 21st International Conference on Information Integration and Web-based Applications & Services
Publication date: 2019-12-02
DOI: 10.1145/3366030.3366094 (https://doi.org/10.1145/3366030.3366094)
Citations: 0

Abstract

Researchers need to crawl geo-social data before they can analyze and visualize it. The conventional way to crawl such data exhaustively is grid-based: the crawler divides a specified area into a grid and queries a database through its API using the center coordinates of each cell. The difficulty with this approach is that the optimal grid size cannot be estimated in advance, because it depends on data density, which in turn varies with the geographical characteristics of the area. We focus on the fact that geo-social data are dense along roads and therefore propose a road-map-based method for exhaustively crawling geo-social data. We demonstrate that our method crawls geo-social data with almost the same number of queries as a grid-based crawler using an optimized grid size.
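
The conventional grid-based crawl described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the bounding-box layout, the function names grid_centers and crawl, the query callable, and the "id" field on returned items are all assumptions made for the example. It shows only the grid baseline, not the proposed road-map method, whose details the abstract does not give.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# Assumed bounding-box layout: (south, west, north, east) in degrees.
BBox = Tuple[float, float, float, float]
Point = Tuple[float, float]  # (lat, lon)


def grid_centers(bbox: BBox, rows: int, cols: int) -> List[Point]:
    """Return the (lat, lon) center of every cell in a rows x cols grid."""
    if rows <= 0 or cols <= 0:
        raise ValueError("rows and cols must be positive")
    south, west, north, east = bbox
    cell_h = (north - south) / rows  # cell height in degrees of latitude
    cell_w = (east - west) / cols    # cell width in degrees of longitude
    return [
        (south + (r + 0.5) * cell_h, west + (c + 0.5) * cell_w)
        for r in range(rows)
        for c in range(cols)
    ]


def crawl(
    bbox: BBox,
    rows: int,
    cols: int,
    query: Callable[[float, float], Iterable[dict]],
) -> Dict[str, dict]:
    """Issue one query per cell center and merge the results by item id.

    `query` is a hypothetical stand-in for a real geo-social API call;
    each returned item is assumed to carry a unique "id" key.
    """
    seen: Dict[str, dict] = {}
    for lat, lon in grid_centers(bbox, rows, cols):
        for item in query(lat, lon):
            seen[item["id"]] = item  # de-duplicate items returned by several cells
    return seen
```

The number of queries is rows * cols, so the cell size trades query cost against coverage. That is the parameter the abstract says cannot be chosen well in advance, because the right value depends on local data density.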