A Comprehensive Survey on Big Data Analytics: Characteristics, Tools and Techniques

Impact Factor 23.8 · Zone 1 (Computer Science) · JCR Q1 (Computer Science, Theory & Methods)
Mohammad Shahnawaz, Manish Kumar
Journal: ACM Computing Surveys
Publication date: 2025-02-15
Publication type: Journal Article
DOI: 10.1145/3718364 (https://doi.org/10.1145/3718364)
Open access: No
Source: Semantic Scholar
Citations: 0

Abstract

Modern computing devices generate vast amounts of diverse data, and the rapid transition across a variety of computing devices leads to big data production. Big data with high velocity, volume, and variety presents challenges such as data inconsistency, scalability, real-time analysis, and tool selection. Although numerous solutions have been proposed for big data processing, they are often limited in scope and effectiveness. This survey aims to address the lack of comprehensive analysis of big data challenges in machine learning (ML) and Internet of Things (IoT) environments, particularly with respect to the 7Vs of big data. It emphasizes the importance of selecting suitable tools for each big data characteristic, providing a structured approach to managing these challenges. The article systematically reviews big data characteristics and associated techniques, with a detailed discussion of various tools and their applications. It also analyzes existing ML methods and techniques for IoT data analytics in big data contexts. Through a systematic literature review (SLR), we examine key aspects, including core concepts, benefits, limitations, and the impact of big data on ML algorithms and IoT data analytics. We highlight groundbreaking studies addressing big data challenges to inform future research and enhance big data-driven applications.
Source journal
ACM Computing Surveys (Engineering & Technology: Computer Science, Theory & Methods)
CiteScore: 33.20
Self-citation rate: 0.60%
Publication volume: 372
Review time: 12 months
About the journal: ACM Computing Surveys (CSUR) is an academic journal that publishes surveys and tutorials on various areas of computing research and practice. The journal aims to provide comprehensive and easily understandable articles that guide readers through the literature and help them understand topics outside their specialties. CSUR had a 2022 Impact Factor of 16.6 and is ranked 3rd out of 111 journals in the field of Computer Science, Theory & Methods. It is indexed and abstracted in various services, including AI2 Semantic Scholar, Baidu, Clarivate/ISI: JCR, CNKI, DeepDyve, DTU, EBSCO: EDS/HOST, and IET Inspec, among others.