Activity-based sampling of Twitter users for temporal prediction models

S. Aghababaei, E. Gultepe, Iuliia Chepurna, M. Makrehchi
Published in: 2016 International Conference on Behavioral, Economic and Socio-cultural Computing (BESC), 2016-11-01
DOI: 10.1109/BESC.2016.7804474 (https://doi.org/10.1109/BESC.2016.7804474)
Citations: 0

Abstract

A growing number of applications rely on crowd-sourced data from social media. Some of these applications work with real-time data streams, while others focus on acquiring temporal footprints from users' historical timelines. In either case, determining the subset of "credible" users is crucial. Most sampling approaches focus on individuals' static networks and usually ignore dynamic user activity over time, which may result in activity gaps in the collected data. Models built on noisy and missing data can degrade significantly in performance. In this study, we demonstrate how to sample Twitter users in order to produce more credible data for temporal prediction models. We present an activity-based sampling approach in which users are selected based on their historical activity on Twitter. The predictability of the content collected through activity-based and random sampling is compared in a user-centric temporal model. The results indicate the importance of activity-oriented sampling for acquiring more credible content for temporal models.
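
The abstract does not state which activity criterion is used, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes, hypothetically, that activity is measured as the fraction of days in an observation window on which a user posted at least once, and that users above a chosen threshold are kept. The function names, the `min_ratio` threshold, and the toy timelines are all illustrative assumptions. A random-sampling baseline is included for comparison.

```python
import random
from datetime import date


def active_day_ratio(post_dates, start, end):
    """Fraction of days in [start, end] on which the user posted at least once.

    Assumption: "activity" means daily posting coverage; the source does not
    define the measure.
    """
    total_days = (end - start).days + 1
    active_days = {d for d in post_dates if start <= d <= end}
    return len(active_days) / total_days


def activity_based_sample(timelines, start, end, min_ratio):
    """Keep users whose active-day ratio meets a hypothetical threshold."""
    return sorted(
        user
        for user, dates in timelines.items()
        if active_day_ratio(dates, start, end) >= min_ratio
    )


def random_sample(timelines, k, seed=0):
    """Baseline: pick k users uniformly at random, ignoring activity."""
    rng = random.Random(seed)
    return sorted(rng.sample(sorted(timelines), k))


# Toy timelines (invented for illustration): user -> list of posting dates.
timelines = {
    "alice": [date(2016, 1, d) for d in range(1, 11)],            # 10 active days
    "bob": [date(2016, 1, 1), date(2016, 1, 1), date(2016, 1, 9)],  # 2 active days
    "carol": [date(2016, 1, d) for d in (1, 2, 3, 5, 6, 8, 9, 10)],  # 8 active days
}
start, end = date(2016, 1, 1), date(2016, 1, 10)

selected = activity_based_sample(timelines, start, end, min_ratio=0.7)
baseline = random_sample(timelines, k=2)
print("activity-based:", selected)
print("random:", baseline)
```

With this toy data, the activity-based sample drops the user with long gaps in their timeline, whereas the random baseline may or may not include them. That kind of gap is what the abstract says can degrade temporal models.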