Efficient Data Processing and Storage at Phasor Data Concentrators in Smart Grids

Smart Grid and Renewable Energy (English) Pub Date: 2022-03-20 DOI: 10.1109/SGRE53517.2022.9774225

Kedar V. Khandeparkar, S. Swain, Parth Chaturvedi


Citation count: 0

Abstract




Phasor measurement units (PMUs) measure multiple parameters of the grid and send data packets to compute nodes called phasor data concentrators (PDCs). Data processing at a PDC involves parsing, time-aligning, and aggregating packets. A PDC also runs time-critical applications with stringent latency requirements (50 to 100 milliseconds). The high volume, velocity, and variety of streaming data packets make it challenging to meet the quality-of-service (QoS) requirements of these applications. Moreover, post-event analysis requires fetching data archived in secondary storage, typically referred to as the Historian. As the volume of data grows over time, storage cost and data retrieval time increase. In this paper, the data processing problem is addressed by parallel parsing and hash-based in-memory data storage during time alignment of packets, thereby reducing the overall processing time at the PDC from quadratic to linear in the number of PMUs. We also explore the performance of offline application queries on data stored in three different storage platforms, namely PostgreSQL, InfluxDB, and MongoDB. The empirical results show that, among the three databases, InfluxDB has lower query execution time than the other two when application queries fetch only selected columns from the database.
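The abstract does not give the implementation, so the following is only a minimal Python sketch of how hash-based in-memory storage can keep time alignment linear in the number of PMUs. It assumes that the quadratic cost comes from searching a list of pending frames on every arrival, which is one plausible reading and not something the abstract states. The names `Frame` and `HashAligner` and the field layout are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    """A parsed PMU data frame (hypothetical, simplified layout)."""
    pmu_id: int
    timestamp_us: int   # measurement time in microseconds
    magnitude: float
    angle_deg: float


class HashAligner:
    """Groups frames by timestamp in a dict, so each arrival costs O(1) expected."""

    def __init__(self, expected_pmus):
        self.expected = frozenset(expected_pmus)
        # timestamp -> {pmu_id: Frame}; both levels are hash tables
        self.buckets = defaultdict(dict)

    def add(self, frame):
        """Store one frame; return the aligned set once every PMU has reported."""
        if frame.pmu_id not in self.expected:
            return None                      # ignore frames from unknown PMUs
        bucket = self.buckets[frame.timestamp_us]
        bucket[frame.pmu_id] = frame         # a duplicate overwrites the old entry
        # Only expected IDs are stored, so a length check detects completeness
        # without scanning the bucket.
        if len(bucket) == len(self.expected):
            return self.buckets.pop(frame.timestamp_us)
        return None


if __name__ == "__main__":
    aligner = HashAligner(expected_pmus=[1, 2, 3])
    for pmu_id in (2, 1, 3):                 # frames may arrive in any order
        aligned = aligner.add(Frame(pmu_id, 1_000_000, 1.0, 0.0))
    print(sorted(aligned))                   # PMU IDs in the completed set
```

Each of the n frames for a timestamp is inserted once with constant expected work, so aligning one timestamp costs O(n). A real PDC would also need a wait timeout to flush incomplete buckets when a PMU's frame is late or lost; that is omitted here.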

