S. Aghababaei, E. Gultepe, Iuliia Chepurna, M. Makrehchi
2016 International Conference on Behavioral, Economic and Socio-cultural Computing (BESC). Published 2016-11-01. DOI: 10.1109/BESC.2016.7804474 (https://doi.org/10.1109/BESC.2016.7804474)
Activity-based sampling of Twitter users for temporal prediction models
A growing number of applications rely on crowd-sourced data from social media. Some are concerned with real-time data streams, while others focus on acquiring temporal footprints from users' historical timelines. In either case, determining the subset of "credible" users is crucial. Most sampling approaches focus on individuals' static networks and usually do not consider dynamic user activity over time, which may leave activity gaps in the collected data. Models built on noisy and missing data can degrade significantly in performance. In this study, we demonstrate how to sample Twitter users so as to produce more credible data for temporal prediction models. We present an activity-based sampling approach in which users are selected based on their historical activity on Twitter. We compare the predictability of content collected through activity-based and random sampling in a user-centric temporal model. The results indicate the importance of activity-oriented sampling for acquiring more credible content for temporal models.
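The abstract does not state the exact selection criterion, so the following is only a minimal sketch of what an activity-based sampler could look like next to a random baseline. It assumes, purely for illustration, that "activity" is measured as the fraction of days in an observation window on which a user posted at least once, and that users are kept when this fraction reaches a threshold. The names `activity_coverage`, `activity_based_sample`, `random_sample`, the `min_coverage` parameter, and the toy timelines are all hypothetical and are not taken from the paper.

```python
import random
from datetime import date, timedelta


def activity_coverage(post_dates, start, end):
    """Fraction of days in [start, end] on which the user posted at least once.

    Assumption: daily coverage is used here as a stand-in for "historical
    activity"; the paper's actual measure is not given in the abstract.
    """
    total_days = (end - start).days + 1
    active_days = {d for d in post_dates if start <= d <= end}
    return len(active_days) / total_days


def activity_based_sample(timelines, start, end, min_coverage):
    """Keep users whose coverage meets the (hypothetical) threshold."""
    return sorted(
        user
        for user, dates in timelines.items()
        if activity_coverage(dates, start, end) >= min_coverage
    )


def random_sample(timelines, k, seed=0):
    """Baseline: pick k users uniformly at random, ignoring activity."""
    rng = random.Random(seed)
    return sorted(rng.sample(sorted(timelines), k))


# Toy timelines over a 10-day window (invented data, for illustration only).
start = date(2016, 1, 1)
end = date(2016, 1, 10)
timelines = {
    # Posts every day: no activity gaps.
    "steady": [start + timedelta(days=i) for i in range(10)],
    # Posts every other day: 5 active days out of 10.
    "regular": [start + timedelta(days=i) for i in range(0, 10, 2)],
    # Many posts, but concentrated on 2 days: large gaps remain.
    "bursty": [start] * 5 + [start + timedelta(days=1)],
    # A single post in the whole window.
    "sparse": [start + timedelta(days=9)],
}

selected = activity_based_sample(timelines, start, end, min_coverage=0.5)
baseline = random_sample(timelines, k=2)
```

With these toy timelines, the activity-based sampler keeps only the users without long gaps, whereas the random baseline may pick a user such as "bursty", who has many posts overall but is silent on most days. Counting active days rather than total posts is a deliberate choice in this sketch, since raw volume would not reveal the activity gaps the abstract is concerned with.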