Simulating New and Old Twitter User Activity with XGBoost and Probabilistic Hybrid Models

Frederick Mubang, Lawrence O. Hall
Published in: 2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA), 2022-12-01
DOI: 10.1109/ICMLA55696.2022.00026 (https://doi.org/10.1109/ICMLA55696.2022.00026)
Citations: 0

Abstract

Simulating New and Old Twitter User Activity with XGBoost and Probabilistic Hybrid Models
The Volume Audience Match Simulator (VAM) is an end-to-end approach for predicting user-to-user interactions on a given social media platform. It comprises two components. The first is an XGBoost-driven Volume Prediction Module that predicts the number of (1) total activities, (2) active old users, and (3) newly active users over the 24 hours following the start time of prediction. The second is a User-Assignment Module that takes the volume predictions as input and predicts the user-to-user interactions of the old and new users. In previous work, VAM was used to predict Twitter discussions related to political crises. In this work, VAM is used to predict future activity on Twitter related to international economic affairs, and we include more experiments and analyses than that previous work. Here VAM predicts all types of retweets, including quotes and replies, whereas previous work focused only on regular retweets. Furthermore, we show that YouTube features, in addition to Reddit features, can improve prediction performance. We examine the importance of the time series features used in VAM's Volume Prediction Module. Lastly, we show that VAM is significantly more accurate than other approaches when predicting highly skewed, lowly skewed, highly sparse, and lowly sparse time series.
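The abstract does not say how the User-Assignment Module turns volume predictions into interactions, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the three predicted counts for one time step are already available and that a history of past (child, parent) interactions exists. The function name `assign_users`, the `new_user_i` naming scheme, and the activity-weighted sampling rule are all hypothetical choices made for illustration.

```python
import random
from collections import Counter


def assign_users(pred_total, pred_old, pred_new, history, seed=0):
    """Map volume predictions to a list of (child, parent) interactions.

    history: Counter of (child_user, parent_user) -> past interaction count.
    The sampling rule below is an illustrative assumption, not VAM itself.
    """
    rng = random.Random(seed)

    # How active each known user was, and how often each parent was answered.
    activity, parent_weight = Counter(), Counter()
    for (child, parent), n in history.items():
        activity[child] += n
        parent_weight[parent] += n

    # Draw distinct old users, favouring historically active ones.
    pool = sorted(activity)
    old_users = []
    for _ in range(min(pred_old, len(pool))):
        pick = rng.choices(pool, weights=[activity[u] for u in pool], k=1)[0]
        old_users.append(pick)
        pool.remove(pick)

    # New users have no history, so they only receive placeholder ids.
    new_users = [f"new_user_{i}" for i in range(pred_new)]
    actors = old_users + new_users

    parents = sorted(parent_weight)
    if not actors or not parents or pred_total <= 0:
        return []
    weights = [parent_weight[p] for p in parents]

    edges = []
    for i in range(pred_total):
        # Each actor acts once while volume allows; the rest is drawn at random.
        child = actors[i] if i < len(actors) else rng.choice(actors)
        parent = rng.choices(parents, weights=weights, k=1)[0]
        edges.append((child, parent))
    return edges


history = Counter({("alice", "news"): 5, ("bob", "news"): 2, ("carol", "bank"): 1})
edges = assign_users(pred_total=6, pred_old=2, pred_new=1, history=history)
```

With these inputs the sketch returns six interactions spread over two sampled old users and one new user. In the approach described above, the three counts would come from the XGBoost-driven Volume Prediction Module rather than being passed in by hand.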