Forecasting Civil Unrest in South Africa Using Social Media Data: A Hybrid Machine Learning Approach

IF 3.0 | Zone 2 (Sociology) | JCR Q2 | COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS
Rejoice Chitengu, Silas Formunyuy Verkijika, Kelibone Eva Mamabolo
DOI: 10.1177/08944393251349542
URL: https://doi.org/10.1177/08944393251349542
Journal: Social Science Computer Review
Publication date: 2025-06-09
Publication type: Journal Article
Open access: No
Source platform: Semantic Scholar
Citations: 0

Abstract

Forecasting Civil Unrest in South Africa Using Social Media Data: A Hybrid Machine Learning Approach
Civil unrest, encompassing protests and riots, is an increasing global concern, with incidents rising at an alarming rate, a trend that has been observed in South Africa over the years. This issue is particularly pronounced in today’s social media era, where platforms like ‘X’ (formerly Twitter) serve as powerful tools for mobilization. This raises the question: What factors drive civil unrest, and how can machine learning, using social media data, be employed to forecast such events? In response, this study aimed to develop a hybrid machine learning model to forecast protest and riot events in South Africa using Twitter data. Following the CRISP-DM methodology, tweets posted between 2019 and 2024 were collected, yielding 18,487 curated tweets, with the associated ground-truth data extracted from the ACLED database. Using this data, a hybrid model combining Bidirectional LSTM (Bi-LSTM) networks with eXtreme Gradient Boosting (XGBoost) was developed for classification and regression tasks to forecast civil unrest in South Africa. Additionally, SHapley Additive exPlanations (SHAP) were used for model explainability. The proposed model outperformed the base model, achieving R-squared values of 33% for protests and 23% for riots in regression, along with classification accuracies of 92% for protests and 86.2% for riots. SHAP results indicated that the key predictors of unrest included sentiment-related features, tweet engagement features, regional factors, the day of the week, public holidays, and the topics being discussed. This study demonstrates the value of a hybrid model in forecasting civil unrest events and identifies key features that stakeholders can use to target their efforts more precisely in addressing civil unrest, ensuring resources are allocated where they are needed most. The study concludes with a discussion of valuable insights for stakeholders on how to leverage social media data to predict and mitigate civil unrest.
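The abstract reports two kinds of scores: R-squared for the regression task and accuracy for the classification task. As a reading aid, the sketch below shows how these two metrics are conventionally computed; under the usual definition, an R-squared of 33% corresponds to a value of 0.33. The function names and the toy numbers are hypothetical illustrations, not the authors' evaluation code or data, and the abstract does not state exactly how the study computed its scores.

```python
from typing import Sequence


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left over after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


def accuracy(labels_true: Sequence[int], labels_pred: Sequence[int]) -> float:
    """Fraction of predicted labels that match the true labels."""
    if len(labels_true) != len(labels_pred) or len(labels_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(labels_true, labels_pred) if t == p)
    return correct / len(labels_true)


if __name__ == "__main__":
    # Hypothetical toy values: daily event counts (regression target)
    # and a binary "unrest occurred" label (classification target).
    daily_events = [2, 0, 5, 3, 1]
    predicted_events = [1.5, 0.5, 4.0, 3.5, 1.0]
    unrest_true = [1, 0, 1, 1, 0]
    unrest_pred = [1, 0, 0, 1, 0]

    print(f"R-squared: {r_squared(daily_events, predicted_events):.3f}")
    print(f"Accuracy:  {accuracy(unrest_true, unrest_pred):.3f}")
```

Read this way, an R-squared of 0.33 means the regression explains about a third of the variance in event counts relative to always predicting the mean, which is a different kind of statement from the 92% share of correctly classified cases.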
Source journal
Social Science Computer Review
Category: Social Sciences, Computer Science: Interdisciplinary Applications
CiteScore: 9.00
Self-citation rate: 4.90%
Articles published: 95
Review time: >12 weeks
About the journal:

Unique Scope
Social Science Computer Review is an interdisciplinary journal covering social science instructional and research applications of computing, as well as societal impacts of information technology. Topics include: artificial intelligence, business, computational social science theory, computer-assisted survey research, computer-based qualitative analysis, computer simulation, economic modeling, electronic modeling, electronic publishing, geographic information systems, instrumentation and research tools, public administration, social impacts of computing and telecommunications, software evaluation, and world-wide web resources for social scientists.

Interdisciplinary Nature
Because the uses and impacts of computing are interdisciplinary, so is Social Science Computer Review. The journal is of direct relevance to scholars and scientists in a wide variety of disciplines. In its pages you'll find work in the following areas: sociology, anthropology, political science, economics, psychology, computer literacy, computer applications, and methodology.