Event attendance prediction using social media

Ubaid Mehmood, I. Moser, Nicole Ronald
Published in: Proceedings of the Australasian Computer Science Week Multiconference, 2020-02-04
DOI: 10.1145/3373017.3373033 (https://doi.org/10.1145/3373017.3373033)
Citations: 3

Abstract

Predicting attendance at events a few hours in advance can be useful for organisers and road users alike. Several studies attempt to detect attendance at the time of the event from social media using geo-tagging or event-based social networks. In this study, we present a novel attendance classifier based on a long short-term memory (LSTM) network and show that it outperforms other machine learning algorithms on two recent data sets with a few thousand attendees. The attendance prediction is based on the content of tweets alone, without the need for network or geospatial information. Analysing the tweets requires text pre-processing, a sequence of steps that is implicit in the classification process and generally not discussed in other studies. We conducted a sensitivity analysis of text pre-processing steps and found that some steps, such as stemming and the removal of a custom list of stop words, did nothing to improve the results, whereas the removal of mentions, punctuation and numbers proved very useful. The best-performing combination was identical for both data sets and led to a 6% improvement in classification performance compared to the worst-performing combination.
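
The pre-processing sensitivity analysis described in the abstract can be pictured as switching individual cleaning steps on and off and evaluating every combination. The following is a minimal sketch of that idea and not the authors' implementation: the function names, the regular expressions and the step order are assumptions made for illustration. Stemming and stop-word removal are left out because they depend on a stemmer and a custom word list that the abstract does not specify.

```python
import re
import string
from itertools import product


def remove_mentions(text: str) -> str:
    """Drop @username mentions (assumed pattern: '@' followed by word characters)."""
    return re.sub(r"@\w+", "", text)


def remove_punctuation(text: str) -> str:
    """Drop all ASCII punctuation characters."""
    return text.translate(str.maketrans("", "", string.punctuation))


def remove_numbers(text: str) -> str:
    """Drop runs of digits."""
    return re.sub(r"\d+", "", text)


# Hypothetical step registry. Mentions come first so that the '@' marker is
# still present when the mention pattern is applied.
STEPS = {
    "mentions": remove_mentions,
    "punctuation": remove_punctuation,
    "numbers": remove_numbers,
}


def preprocess(text: str, enabled) -> str:
    """Apply only the enabled steps, then collapse runs of whitespace."""
    for name, step in STEPS.items():
        if name in enabled:
            text = step(text)
    return " ".join(text.split())


def all_combinations():
    """Yield every on/off combination of the registered steps."""
    names = list(STEPS)
    for flags in product([False, True], repeat=len(names)):
        yield frozenset(n for n, on in zip(names, flags) if on)


if __name__ == "__main__":
    tweet = "@anna See you at the game at 7pm!!"
    for combo in all_combinations():
        print(sorted(combo), "->", preprocess(tweet, combo))
```

In a full sensitivity analysis, each combination would be used to clean the training tweets before fitting and scoring the classifier; that training loop is not shown here.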