Comparison of NoSQL Datastores for Large Scale Data Stream Log Analytics

Khalid Mahmood, Kjell Orsborn, T. Risch
Venue: 2019 IEEE International Conference on Smart Computing (SMARTCOMP)
Publication date: 2019-06-01
DOI: 10.1109/SMARTCOMP.2019.00093 (https://doi.org/10.1109/SMARTCOMP.2019.00093)
Citations: 7

Abstract

With the advent of cyber-physical systems, the industrial internet of things (IIoT), and industrial analytics, numerous application scenarios have emerged in which business and mission-critical decisions depend on large-scale analysis of data in the form of sensor streams. However, large volumes of sensor stream data generated at high frequency pose substantial challenges for existing scalable data analysis techniques, requiring the use of high-performance distributed datastores. This work presents an in-depth performance comparison of three principal categories of distributed state-of-the-art NoSQL datastores, evaluating their applicability and efficiency for large-scale analysis of sensor logs from real-world hydraulic power systems. One central datastore is selected from each category for the performance evaluation: MongoDB as the document store, Cassandra as the column store, and Redis as the distributed main-memory key-value store. Understanding the differences among, and the behavior of, these types of systems is crucial for optimizing application performance. Key insights from this work can serve as a basis for an improved understanding of the applicability of NoSQL datastores in systems for large-scale data stream analysis. This will be important for supporting data analytics in IIoT applications such as the monitoring and control of power plants, smart cities, and transportation systems, and environmental and health monitoring.