Extreme-Scale Model-Based Time Series Management with ModelarDB (Invited Talk)

Time | Pub Date: 2021-01-01 | DOI: 10.4230/LIPIcs.TIME.2021.2

T. Pedersen


Citations: 0

Abstract


To monitor critical industrial devices such as wind turbines, high-quality sensors sampled at a high frequency are increasingly used. Current technology does not handle these extreme-scale time series well [1], so traditionally only simple aggregates are stored, which removes outliers and fluctuations that could indicate problems. As a remedy, we present a model-based approach for managing extreme-scale time series that approximates the time series values using mathematical functions (models) and stores only the model coefficients rather than the data values. Compression is performed both for individual time series and for correlated groups of time series. The keynote will present concepts, techniques, and algorithms from model-based time series management and our implementation of these in the open-source Time Series Management System (TSMS) ModelarDB [2, 3, 4]. Furthermore, it will present our experimental evaluation of ModelarDB on extreme-scale real-world time series, which shows that, compared to widely used Big Data formats, ModelarDB provides up to 14× faster ingestion due to high compression, 113× better compression due to its adaptability, 573× faster aggregation by using models, and close to linear scale-out scalability. ModelarDB is being commercialized by the spin-out company ModelarData.

2012 ACM Subject Classification: Information systems → Data management systems
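The idea of storing model coefficients instead of data values can be illustrated with a minimal sketch. It is not ModelarDB's implementation: it assumes a single, deliberately simple model type (one constant per segment) and a user-chosen absolute error bound, and the names `compress`, `decompress`, and `segment_sum` are hypothetical. It also shows why aggregates can be answered from the models alone, without reconstructing the data points.

```python
from typing import List, Tuple

# Hypothetical segment layout: (start index, number of points, constant value).
Segment = Tuple[int, int, float]


def compress(values: List[float], error_bound: float) -> List[Segment]:
    """Greedily cover the series with constant models.

    A segment is extended while a single constant (the midpoint of the
    segment's min and max) stays within error_bound of every point.
    """
    segments: List[Segment] = []
    start = 0
    while start < len(values):
        lo = hi = values[start]
        end = start + 1
        while end < len(values):
            new_lo = min(lo, values[end])
            new_hi = max(hi, values[end])
            # The midpoint is within error_bound of all points only if
            # the value range is at most twice the error bound.
            if new_hi - new_lo > 2 * error_bound:
                break
            lo, hi = new_lo, new_hi
            end += 1
        segments.append((start, end - start, (lo + hi) / 2))
        start = end
    return segments


def decompress(segments: List[Segment]) -> List[float]:
    """Reconstruct an approximation of the original series."""
    out: List[float] = []
    for _, length, value in segments:
        out.extend([value] * length)
    return out


def segment_sum(segments: List[Segment]) -> float:
    """Approximate SUM computed from model coefficients only."""
    return sum(length * value for _, length, value in segments)


if __name__ == "__main__":
    readings = [10.0, 10.1, 9.9, 10.0, 20.0, 20.2, 20.1]
    models = compress(readings, error_bound=0.2)
    print(len(readings), "points stored as", len(models), "segments")
    print("approximate sum:", segment_sum(models))
```

In this sketch the aggregate touches one tuple per segment rather than one value per data point, which is the intuition behind answering aggregate queries on models; a real system would choose among several model types per segment and handle timestamps, gaps, and groups of correlated series.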


Source journal: Time
