A survey on data stream analytics

Handbook of Big Data Analytics. Volume 1: Methodologies. Pub Date: 2021-07-07. DOI: 10.1049/pbpc037f_ch6

Sumit Misra, S. Saha, C. Mazumdar

Citations: 0

Abstract

With the exponential expansion of the interconnected world, data now flow through systems in large volume, variety and velocity. Dependence on these systems has crossed the threshold of business value, and such communication systems have started to be classified as essential. They have become vital social infrastructure that requires prediction, monitoring, safeguarding and immediate decision-making in case of threats. The key enabler is data stream analytics (DSA). In DSA, the key areas of stream processing are prediction and forecasting, classification, clustering, mining frequent patterns and finding frequent item sets (FISs), detecting concept drift, building synopsis structures to answer standing and ad hoc queries, sampling and load shedding during bursts of data, and processing data streams emanating from the very large number of interconnected devices typical of the Internet of Things (IoT). Processing complexity is affected by the multidimensionality of the stream data objects, by the need to build 'forgetting' into the processing as a key construct, by the use of the time-series aspect to aid processing, and so on. In this chapter, we explore some of these areas and provide a survey of each selected area. We also survey the data stream processing systems (DSPSs) and frameworks that are being adopted by the industry at large.

Source journal

Handbook of Big Data Analytics. Volume 1: Methodologies

Self-citation rate

0.00%

Number of publications