Toponym Recognition in Social Media for Estimating the Location of Events

M. Sagcan, P. Senkul
2015 IEEE International Conference on Data Mining Workshop (ICDMW), published 2015-11-14
DOI: 10.1109/ICDMW.2015.167 (https://doi.org/10.1109/ICDMW.2015.167)
Citations: 11

Abstract

The prominence of social media platforms such as Twitter and Facebook has produced a huge collection of data over which event detection provides useful results. An important dimension of event detection is estimating the location of detected events. Social media provides a variety of location clues, such as geographical annotations from smart devices, the location field in the user profile, and the content of the message. Among these clues, message content requires more processing effort, yet it is generally more informative. In this paper, we focus on extracting location names from social media messages, i.e., toponym recognition. We propose a hybrid system that uses both rule-based and machine-learning-based techniques to extract toponyms from tweets. Conditional Random Fields (CRF) are used as the machine learning tool, and features such as part-of-speech tags and a conjunction window are defined to construct a CRF model for toponym recognition. In the rule-based part, regular expressions are used to define some of the toponym recognition patterns and to provide a simple level of normalization that handles the informality of the text. Experimental results show that the proposed method achieves a higher toponym recognition ratio than previous studies.
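The abstract does not list the actual regular expressions or CRF feature templates, so the following is only a minimal sketch of how the two parts of such a hybrid system might look. Every pattern and name here (`normalize`, `rule_candidates`, `token_features`, the preposition rule) is a hypothetical illustration, not the authors' implementation. "Conjunction window" is read here simply as copying the features of neighboring tokens into the current token's feature set, which is an assumption.

```python
import re

# Hypothetical normalization patterns; the paper's real rules are not given.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#(\w+)")
REPEAT_RE = re.compile(r"([A-Za-z])\1{2,}")  # same letter three or more times

# Hypothetical rule: a preposition followed by one or more capitalized words.
PREP_TOPONYM_RE = re.compile(
    r"\b(?:in|at|from|near)\s+([A-Z][a-z]+(?:\s+[A-Z][a-z]+)*)"
)


def normalize(tweet):
    """Light regex cleanup of an informal message."""
    text = URL_RE.sub("", tweet)
    text = MENTION_RE.sub("", text)
    text = HASHTAG_RE.sub(r"\1", text)   # keep the hashtag word, drop '#'
    text = REPEAT_RE.sub(r"\1", text)    # collapse elongated letters
    return " ".join(text.split())        # collapse leftover whitespace


def rule_candidates(text):
    """Return toponym candidates matched by the preposition rule."""
    return PREP_TOPONYM_RE.findall(text)


def token_features(tokens, pos_tags, i, window=1):
    """Build a feature dict for token i, including its neighbors' features."""
    feats = {
        "word.lower": tokens[i].lower(),
        "word.istitle": tokens[i].istitle(),
        "pos": pos_tags[i],
    }
    for offset in range(-window, window + 1):
        if offset == 0:
            continue
        j = i + offset
        if 0 <= j < len(tokens):
            feats[f"{offset:+d}:word.lower"] = tokens[j].lower()
            feats[f"{offset:+d}:pos"] = pos_tags[j]
        else:
            feats[f"{offset:+d}:pad"] = True  # neighbor is outside the sentence
    return feats


clean = normalize("Huge fire in Ankaraaaa!!! @user http://t.co/x #Turkey")
candidates = rule_candidates(clean)
feats = token_features(["fire", "in", "Ankara"], ["NN", "IN", "NNP"], 2)
```

Feature dicts of this shape could be passed, one list per tweet, to a CRF toolkit for training. The POS tags in the usage lines are written by hand; in practice they would come from a tagger suited to tweets.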