A survey on data stream analytics

Sumit Misra, S. Saha, C. Mazumdar
Published in: Handbook of Big Data Analytics. Volume 1: Methodologies, 2021-07-07
DOI: 10.1049/pbpc037f_ch6 (https://doi.org/10.1049/pbpc037f_ch6)
Citations: 0

Abstract

With the exponential expansion of the interconnected world, data now flows through systems at high volume, variety and velocity. Dependence on these systems has crossed the threshold of business value, and such communication systems have started to be classified as essential systems. They have become vital social infrastructure that requires prediction, monitoring, safeguarding and immediate decision-making in case of threats. The key enabler is data stream analytics (DSA). In DSA, the key areas of stream processing are prediction and forecasting; classification; clustering; mining frequent patterns and finding frequent item sets (FISs); detecting concept drift; building synopsis structures to answer standing and ad hoc queries; sampling and load shedding during bursts of data; and processing data streams emanating from the very large number of interconnected devices typical of the Internet of Things (IoT). Processing complexity is affected by the multidimensionality of stream data objects, by the need to build 'forgetting' into the processing as a key construct, by the use of the time-series aspect to aid processing, and so on. In this chapter, we explore some of these areas and provide a survey of each selected area. We also survey the data stream processing systems (DSPSs) and frameworks being adopted by industry at large.
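
To make the idea of 'forgetting' concrete, the sketch below keeps a running mean in which older stream items count for progressively less. It is an illustration only and is not taken from the chapter: the function name `fading_mean` and the decay parameter `alpha` are assumptions chosen for this example, and exponential decay is just one of several possible forgetting schemes.

```python
from typing import Iterable, Iterator


def fading_mean(stream: Iterable[float], alpha: float = 0.1) -> Iterator[float]:
    """Yield an exponentially decayed mean after each stream item.

    `alpha` (hypothetical parameter) controls how fast old items are
    forgotten: values near 1 track only recent items, values near 0
    retain a long memory. Only one number of state is kept, so memory
    use is constant regardless of stream length.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    estimate = None
    for x in stream:
        if estimate is None:
            # The first item initialises the estimate.
            estimate = float(x)
        else:
            # Shrink the old estimate and blend in the new item.
            estimate = (1.0 - alpha) * estimate + alpha * x
        yield estimate


if __name__ == "__main__":
    readings = [0, 0, 10]
    print(list(fading_mean(readings, alpha=0.5)))
```

Because the function is a generator, it consumes the stream one item at a time and never needs to store past items, which is the constraint that makes forgetting necessary in stream processing.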