Big data analytics in Cloud computing: an overview.

Blend Berisha, Endrit Mëziu, Isak Shabani
Journal: Journal of Cloud Computing (Heidelberg, Germany), 2022, p. 24 (Epub 2022-08-06)
DOI: https://doi.org/10.1186/s13677-022-00301-w
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9362456/pdf/
Citations: 19

Abstract

Big Data and Cloud Computing, as two mainstream technologies, are at the center of concern in the IT field. Every day a huge amount of data is produced from different sources. This data is so large that traditional processing tools are unable to deal with it. Besides being big, this data moves fast and has a lot of variety. Big Data is a concept that deals with storing, processing and analyzing large amounts of data. Cloud computing, on the other hand, is about offering the infrastructure to enable such processes in a cost-effective and efficient manner. Many sectors, including businesses (small or large), healthcare and education, are trying to leverage the power of Big Data. In healthcare, for example, Big Data is being used to reduce costs of treatment, predict outbreaks of pandemics and prevent diseases. This paper presents an overview of Big Data Analytics as a crucial process in many fields and sectors. We start with a brief introduction to the concept of Big Data, the amount of data that is generated on a daily basis, and the features and characteristics of Big Data. We then delve into Big Data Analytics, where we discuss issues such as the analytics cycle, analytics benefits and the movement from the ETL to the ELT paradigm as a result of Big Data analytics in the Cloud. As a case study we analyze Google's BigQuery, which is a fully managed, serverless data warehouse that enables scalable analysis over petabytes of data. As a Platform as a Service (PaaS), it supports querying using ANSI SQL. We use the tool to perform different experiments, such as average read, average compute and average write, on datasets of different sizes.
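
The move from ETL to ELT mentioned in the abstract can be illustrated with a small sketch. This is not the paper's experiment and does not use BigQuery: it assumes Python's built-in `sqlite3` module as a stand-in for a SQL-capable warehouse, and the table names, field names and sample rows (`raw_visits`, `cost_by_site`, `RAW_ROWS`) are hypothetical. In ETL the data is transformed in application code before loading; in ELT the raw rows are loaded unchanged and the transformation is expressed in SQL inside the store.

```python
import sqlite3

# Hypothetical raw records; field names and values are made up for illustration.
RAW_ROWS = [
    ("2022-01-01", "clinic_a", "120.50"),
    ("2022-01-01", "clinic_b", "80.00"),
    ("2022-01-02", "clinic_a", "99.50"),
]


def etl(rows):
    """ETL: transform in application code first, then load only the result."""
    totals = {}
    for _day, site, cost in rows:
        # Transform step: parse the cost string and aggregate per site.
        totals[site] = totals.get(site, 0.0) + float(cost)
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE cost_by_site (site TEXT, total REAL)")
    # Load step: only the aggregated rows reach the store.
    conn.executemany("INSERT INTO cost_by_site VALUES (?, ?)", sorted(totals.items()))
    return conn.execute(
        "SELECT site, total FROM cost_by_site ORDER BY site"
    ).fetchall()


def elt(rows):
    """ELT: load the raw rows unchanged, then transform with SQL in the store."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE raw_visits (day TEXT, site TEXT, cost TEXT)")
    # Load step: raw rows go in as-is, including the unparsed cost strings.
    conn.executemany("INSERT INTO raw_visits VALUES (?, ?, ?)", rows)
    # Transform step: casting and aggregation happen inside the SQL engine.
    return conn.execute(
        "SELECT site, SUM(CAST(cost AS REAL)) AS total "
        "FROM raw_visits GROUP BY site ORDER BY site"
    ).fetchall()


if __name__ == "__main__":
    print(etl(RAW_ROWS))
    print(elt(RAW_ROWS))
```

Both functions produce the same per-site totals; the difference is where the transformation runs. With ELT the raw table stays available for later, different queries, which is one reason the pattern suits warehouses that scale compute on demand.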
