Predicting refugee flows from Ukraine with an approach to Big (Crisis) Data: a new opportunity for refugee and humanitarian studies

T. Jurić
DOI: 10.1101/2022.03.15.22272428 (https://doi.org/10.1101/2022.03.15.22272428)
Journal: Athens Journal of Technology & Engineering
Publication date: 2022-03-16
Citations: 5

Abstract

Background: This paper shows that Big Data and the so-called tools of digital demography, such as Google Trends (GT) and insights from social networks such as Instagram, Twitter and Facebook, can be useful for determining, estimating, and predicting the forced migration flows to the EU caused by the war in Ukraine.

Objective: The objective of this study was to test the usefulness of Google Trends indexes for predicting further forced migration from Ukraine to the EU (mainly to Germany) and to gain demographic insights from social networks into the age and gender structure of refugees.

Methods: The primary methodological concept of our approach is to monitor the digital trace of Internet searches in Ukrainian, Russian and English with the Google Trends analytical tool (trends.google.com). Initially, keywords were chosen that are predictive, specific, and common enough to predict forced migration from Ukraine. We retrieved data for the periods before and during the outbreak of the war and divided the keyword frequency for each migration-related query to standardise the data. We compared this search frequency index with official statistics from UNHCR to assess the significance of the results and correlations and to test the model's predictive potential. Since UNHCR does not yet have complete data on the demographic structure of refugees, we filled this gap with three alternative Big Data sources: Facebook, Twitter and Instagram.

Results: All tested migration-related search queries about emigration planning from Ukraine show a positive linear association between the Google index and official UNHCR statistics: R² = 0.1211 for searches in Russian and R² = 0.1831 for searches in Ukrainian. Ukrainians search in Russian more often than in Ukrainian. Increases in migration-related search activity in Ukraine for terms such as граница (Rus. border), кордону (Ukr. border); Польща (Poland); Германия (Rus. Germany), Німеччина (Ukr. Germany); and Угорщина and Венгрия (Hungary) correlate strongly with official UNHCR data for externally displaced persons from Ukraine. Searches in all three languages show that interest in Poland is the highest. When refugees arrive in nearby countries, searches for terms related to Germany, such as "crossing the border + Germany", increase sharply. This result confirms our hypothesis that one-third of all refugees will cross into Germany. According to Big Data insights, the estimated total number of expected refugees is 5.4 million. The most represented age group is between 24 and 45 years (data for children are unavailable), and over 65% are women.

Conclusion: The increase in migration-related search queries is correlated with the rise in the number of refugees from Ukraine in the EU. Thus, this method allows reliable forecasts. Understanding the consequences of forced migration from Ukraine is crucial to enabling UNHCR and governments to develop optimal humanitarian strategies and prepare for refugee reception and possible integration. The benefit of this method is reliable estimation and forecasting that can allow governments and UNHCR to prepare for and better respond to the recent humanitarian crisis.
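
The comparison between a search frequency index and official refugee counts described in the Methods can be illustrated with a simple least-squares fit. The following is only a minimal sketch, not the author's actual procedure: the function name `linear_fit_r2` and both data series are hypothetical placeholders invented for illustration, and the abstract does not state which software or exact regression setup was used.

```python
from statistics import mean


def linear_fit_r2(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (intercept, slope, r2), where r2 is the coefficient of
    determination of the simple linear regression.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each contain some variation")
    slope = sxy / sxx
    intercept = my - slope * mx
    # For a one-predictor regression, R^2 equals the squared Pearson correlation.
    r2 = (sxy ** 2) / (sxx * syy)
    return intercept, slope, r2


# Hypothetical placeholder series, NOT data from the study:
# a 0-100 search index and refugee counts in thousands for the same days.
search_index = [10, 25, 40, 60, 80, 100]
refugees_thousands = [50, 120, 210, 290, 410, 500]

intercept, slope, r2 = linear_fit_r2(search_index, refugees_thousands)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, R^2={r2:.4f}")
```

Under this definition, R² is the share of variance in the refugee counts that the linear fit on the search index accounts for, so the reported R² = 0.1211 would correspond to roughly 12% of the variance and R² = 0.1831 to roughly 18%.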