Impact Analysis of COVID-19 Pandemic on Istanbul Traffic with Big Data Tools

IF 0.6 Q4 ENGINEERING, ELECTRICAL & ELECTRONIC

Electrica Pub Date : 2022-06-06 DOI:10.54614/electrica.2022.210005

Ugur Alcan, F. Kaçar


Citations: 0

Abstract


Impact Analysis of COVID-19 Pandemic on Istanbul Traffic with Big Data Tools

With the spread of the internet, people now produce data in almost everything they do. Countless everyday activities, such as sending messages on WhatsApp, sharing photos on Instagram, searching on Google, and sending email, create a huge data source, and this process repeats every single day. Data this dense and varied also produces information garbage, and analyzing it with traditional technologies has been a further problem. Large companies that want to analyze this mass of information, understand their customers' behavior, and set their strategies according to the results have come up with the concept of big data. Big data is the transformation of data obtained from different sources, such as social media shares, sensor data, photo archives, call records from Global System for Mobile Communications (GSM) operators, and search engine statistics, into a meaningful and processable form [1]. In this study, the effect of the coronavirus disease 2019 (COVID-19) pandemic, an important present-day problem, on Istanbul traffic is examined using big data technologies. To this end, the 2020 hourly traffic index dataset openly published by Istanbul Metropolitan Municipality [2] and a curfew time dataset are considered. Apache Spark, a new-generation data processing tool, is used to analyze these datasets. With Apache Spark, a general analysis of the 2020 Istanbul traffic index data is carried out first; the results are then checked for association with the curfew time dataset, and an impact analysis is performed. Elasticsearch is used to store the processed data, and Kibana is used for data visualization. At the end of the study, machine learning applications for traffic density are developed using the application programming interface (API) of Apache Spark's machine learning library, with logistic regression, decision trees, random forest, gradient-boosted tree-based OneVsRest, and linear support vector machine-based OneVsRest methods.
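The curfew impact comparison described in the abstract can be illustrated with a minimal sketch. It is written in plain Python rather than Spark so the logic can be read without a cluster, and it is not the authors' implementation. The readings, the curfew window, and the helper names `in_curfew` and `mean_index_by_curfew` are all hypothetical; the numbers are invented and do not come from the Istanbul dataset.

```python
from datetime import datetime
from statistics import mean

# Hypothetical hourly readings as (timestamp, traffic index) pairs.
# The values are invented for illustration only.
readings = [
    (datetime(2020, 5, 1, 18), 60),
    (datetime(2020, 5, 2, 9), 5),
    (datetime(2020, 5, 2, 18), 7),
    (datetime(2020, 5, 4, 9), 40),
    (datetime(2020, 5, 4, 18), 50),
]

# Hypothetical curfew windows as half-open intervals [start, end).
curfews = [
    (datetime(2020, 5, 2, 0), datetime(2020, 5, 4, 0)),
]


def in_curfew(ts, windows):
    """Return True if the timestamp falls inside any curfew window."""
    return any(start <= ts < end for start, end in windows)


def mean_index_by_curfew(readings, windows):
    """Group readings by curfew flag and return the mean index per group."""
    groups = {True: [], False: []}
    for ts, index in readings:
        groups[in_curfew(ts, windows)].append(index)
    # Skip empty groups so mean() is never called on an empty list.
    return {flag: mean(values) for flag, values in groups.items() if values}


summary = mean_index_by_curfew(readings, curfews)
print("mean index during curfew:", summary[True])
print("mean index outside curfew:", summary[False])
```

In Spark, the same comparison would typically be expressed as a join between the traffic and curfew tables followed by a grouped aggregation; the sketch only shows the shape of the calculation.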


Source journal

Electrica Engineering-Electrical and Electronic Engineering

CiteScore

2.10

Self-citation rate

0.00%

Number of publications