Predicting Geolocation of Tweets: Using Combination of CNN and BiLSTM

IF 5.1 · Zone 2 (Computer Science) · JCR Q1 (Computer Science, Information Systems)
Data Science and Engineering Pub Date : 2021-01-01 Epub Date: 2021-07-08 DOI:10.1007/s41019-021-00165-1
Rhea Mahajan, Vibhakar Mansotra
Data Science and Engineering, vol. 6, no. 4, pp. 402-410. Journal Article. https://doi.org/10.1007/s41019-021-00165-1
Citations: 13

Abstract

Twitter is one of the most popular micro-blogging and social networking platforms, where users post their opinions, preferences, activities, thoughts, and views as tweets of up to 280 characters. To study and analyse the social behavior and activities of users across a region, it is necessary to identify the location of each tweet. This paper aims to predict the city-level geolocation of real-time tweets collected over a period of 30 days, using a combination of a convolutional neural network (CNN) and a bidirectional long short-term memory (BiLSTM) network that extracts features within the tweets and features associated with the tweets. We also compare our results with previous baseline models; our experiments show a significant improvement over the baseline methods, achieving an accuracy of 92.6 with a median error of 22.4 km at city-level prediction.
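The abstract reports two evaluation figures: city-level accuracy and median error in kilometres. The abstract does not say how the distance is measured, so the following is only a minimal sketch under stated assumptions: error is taken as the great-circle (haversine) distance between the centre of the true city and the centre of the predicted city. The function names, the `city_coords` lookup, and the approximate coordinates are all hypothetical illustrations, not taken from the paper.

```python
import math
from statistics import median

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def evaluate(true_cities, pred_cities, city_coords):
    """Return (accuracy, median_error_km) for city-level predictions.

    city_coords maps a city label to its (lat, lon) centre in degrees.
    """
    if not true_cities or len(true_cities) != len(pred_cities):
        raise ValueError("label lists must be non-empty and of equal length")
    pairs = list(zip(true_cities, pred_cities))
    correct = sum(t == p for t, p in pairs)
    errors = [haversine_km(*city_coords[t], *city_coords[p]) for t, p in pairs]
    return correct / len(pairs), median(errors)


# Hypothetical example data (approximate city-centre coordinates).
city_coords = {
    "Delhi": (28.61, 77.21),
    "Mumbai": (19.08, 72.88),
    "Jammu": (32.73, 74.86),
}
true_cities = ["Delhi", "Mumbai", "Jammu"]
pred_cities = ["Delhi", "Mumbai", "Delhi"]

accuracy, median_error_km = evaluate(true_cities, pred_cities, city_coords)
print(f"accuracy={accuracy:.3f}, median error={median_error_km:.1f} km")
```

With this definition, every correct prediction contributes an error of 0 km, so the median error is only non-zero when distances are measured against each tweet's actual coordinates or when more than half of the predictions are wrong. The paper's own distance measure may differ.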

Source journal
Data Science and Engineering (Engineering: Computational Mechanics)
CiteScore: 10.40
Self-citation rate: 2.40%
Articles published: 26
Review time: 12 weeks
About the journal: The journal Data Science and Engineering (DSE) responds to the remarkable shift in the focus of information technology development from CPU-intensive computation to data-intensive computation, where the effective application of data, especially big data, becomes vital. The emerging discipline of data science and engineering, an interdisciplinary field integrating theories and methods from computer science, statistics, information science, and other fields, focuses on the foundations and engineering of efficient and effective techniques and systems for data collection and management, for data integration and correlation, for information and knowledge extraction from massive data sets, and for data use in different application domains. Focusing on the theoretical background and advanced engineering approaches, DSE aims to offer a prime forum for researchers, professionals, and industrial practitioners to share their knowledge in this rapidly growing area. It provides in-depth coverage of the latest advances in the closely related fields of data science and data engineering. More specifically, DSE covers four areas: (i) the data itself, i.e., the nature and quality of the data, especially big data; (ii) the principles of information extraction from data, especially big data; (iii) the theory behind data-intensive computing; and (iv) the techniques and systems used to analyze and manage big data. DSE welcomes papers that explore the above subjects.
Specific topics include, but are not limited to: (a) the nature and quality of data; (b) the computational complexity of data-intensive computing; (c) new methods for the design and analysis of algorithms for solving problems with big data input; (d) collection and integration of data collected from the internet and from sensing devices or sensor networks; (e) representation, modeling, and visualization of big data; (f) storage, transmission, and management of big data; (g) methods and algorithms of data-intensive computing, such as mining big data, online analytical processing of big data, big data-based machine learning, big data-based decision-making, statistical computation of big data, graph-theoretic computation of big data, linear algebraic computation of big data, and big data-based optimization; (h) hardware systems and software systems for data-intensive computing; (i) data security, privacy, and trust; and (j) novel applications of big data.