Simulating New and Old Twitter User Activity with XGBoost and Probabilistic Hybrid Models
Frederick Mubang, Lawrence O. Hall
2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA), published 2022-12-01
DOI: 10.1109/ICMLA55696.2022.00026 (https://doi.org/10.1109/ICMLA55696.2022.00026)
Citations: 0
Abstract
The Volume Audience Match Simulator (VAM) is an end-to-end approach for predicting user-to-user interactions on a given social media platform. It comprises two components. The first is an XGBoost-driven Volume Prediction module that predicts the number of (1) total activities, (2) active old users, and (3) newly active users over the 24 hours following the start time of prediction. The second is a User-Assignment Module that takes the volume predictions as input and predicts the user-to-user interactions of the old and new users. In previous work, VAM was used to predict Twitter discussions related to political crises. In this work, VAM is used to predict future Twitter activity related to international economic affairs, and we include more experiments and analyses than the previous work did. Here VAM predicts all types of retweets, including quotes and replies, whereas previous work focused only on regular retweets. Furthermore, we show that YouTube features, in addition to Reddit features, can improve prediction performance. We examine the importance of the time series features used in VAM's Volume Prediction module. Lastly, we show that VAM is significantly more accurate than other approaches when predicting time series with high or low skew and high or low sparsity.
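To make the two-stage idea concrete, the sketch below illustrates only the second stage: turning one time step's volume predictions into user-to-user edges. It is not the authors' User-Assignment Module, whose details the abstract does not give. The function names (`assign_users`, `weighted_sample_without_replacement`), the `volume` dictionary keys, the `new_user_` naming scheme, and the choice to weight users by their past activity counts are all assumptions made for illustration.

```python
import random
from collections import Counter


def weighted_sample_without_replacement(rng, weights, k):
    """Pick up to k distinct keys from `weights`, favouring larger weights."""
    pool = dict(weights)
    chosen = []
    for _ in range(min(k, len(pool))):
        names = list(pool)
        pick = rng.choices(names, weights=[pool[n] for n in names], k=1)[0]
        chosen.append(pick)
        del pool[pick]  # remove so the same user is not picked twice
    return chosen


def assign_users(volume, history, seed=0):
    """Hypothetical user assignment for one time step.

    volume:  dict with predicted counts under the (assumed) keys
             'activities', 'old_users', and 'new_users'.
    history: list of past (child_user, parent_user) interactions.
    Returns a list of predicted (child_user, parent_user) edges.
    """
    rng = random.Random(seed)
    activity = Counter(child for child, _ in history)    # how often each user acted
    influence = Counter(parent for _, parent in history)  # how often each user was acted on

    # Old users: sampled from history, weighted by past activity (an assumption).
    old = weighted_sample_without_replacement(rng, activity, volume["old_users"])
    # New users: no history exists, so placeholder identifiers are generated.
    new = [f"new_user_{i}" for i in range(volume["new_users"])]
    actors = old + new
    if not actors or not influence:
        return []

    parents = list(influence)
    parent_weights = [influence[p] for p in parents]
    edges = []
    for i in range(volume["activities"]):
        # The first len(actors) activities go to each active user in turn;
        # any remaining activities are given to uniformly sampled active users.
        child = actors[i] if i < len(actors) else rng.choice(actors)
        parent = rng.choices(parents, weights=parent_weights, k=1)[0]
        edges.append((child, parent))
    return edges
```

Calling `assign_users({"activities": 6, "old_users": 2, "new_users": 1}, history)` with a small `history` list returns six edges spread over two previously seen users and one placeholder new user. In a full pipeline, the `volume` counts would come from the volume prediction stage rather than being written by hand.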