Authors: Kedar V. Khandeparkar, S. Swain, Parth Chaturvedi
DOI: 10.1109/SGRE53517.2022.9774225 (https://doi.org/10.1109/SGRE53517.2022.9774225)
Published: 2022-03-20
Journal: 智能电网与可再生能源(英文), volume 33 1, pages 1-6
Efficient Data Processing and Storage at Phasor Data Concentrators in Smart Grids
Phasor measurement units (PMUs) measure multiple parameters of the grid and send data packets to compute nodes called phasor data concentrators (PDCs). Data processing at a PDC involves parsing, time-aligning, and aggregating packets. A PDC also runs time-critical applications with stringent latency requirements (50 to 100 milliseconds). The high volume, velocity, and variety of streaming data packets make it challenging to meet the quality-of-service (QoS) requirements of these applications. Moreover, post-event analysis requires fetching data archived in secondary storage, typically referred to as the Historian. As the volume of data grows over time, both storage cost and data retrieval time increase. In this paper, the data-processing problem is addressed by parallel parsing and hash-based in-memory data storage during time alignment of packets, thereby reducing the overall processing time at the PDC from quadratic to linear in the number of PMUs. We have also explored the performance of offline application queries on data stored in three different storage platforms, namely PostgreSQL, InfluxDB, and MongoDB. The empirical results show that, among the three databases, InfluxDB has the lowest query execution time for application queries that fetch selected columns from the database.
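
The hash-based time alignment mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Frame` and `HashAligner` names, the frame fields, and the "release when every PMU has reported" rule are all assumptions made for illustration. The idea shown is that keying buffered frames by time stamp in a hash table makes each insertion a constant-time operation on average, so aligning one time stamp across n PMUs costs roughly O(n) rather than the O(n^2) that a linear search per arriving packet would incur.

```python
from collections import defaultdict
from typing import List, NamedTuple, Optional


class Frame(NamedTuple):
    """Hypothetical parsed PMU data frame; the fields are illustrative only."""
    pmu_id: int
    timestamp_us: int  # measurement time stamp in microseconds
    magnitude: float
    angle_deg: float


class HashAligner:
    """Groups frames by time stamp in a dict (hash table)."""

    def __init__(self, expected_pmus: int) -> None:
        self.expected_pmus = expected_pmus
        # Maps time stamp -> {pmu_id -> Frame}; lookups are O(1) on average.
        self._buckets = defaultdict(dict)

    def add(self, frame: Frame) -> Optional[List[Frame]]:
        """Store a frame; return the aligned set once every PMU has reported."""
        bucket = self._buckets[frame.timestamp_us]
        bucket[frame.pmu_id] = frame
        if len(bucket) == self.expected_pmus:
            # All PMUs reported for this time stamp: release and free the bucket.
            del self._buckets[frame.timestamp_us]
            return [bucket[pmu_id] for pmu_id in sorted(bucket)]
        return None

    def pending(self) -> int:
        """Number of time stamps still waiting for at least one PMU."""
        return len(self._buckets)


aligner = HashAligner(expected_pmus=2)
first = aligner.add(Frame(1, 1000, 1.00, 0.0))    # None: PMU 2 has not reported yet
second = aligner.add(Frame(2, 1000, 0.98, -1.5))  # both PMUs present: aligned set
```

A production PDC would also need a wait-time limit so that a missing or late packet does not hold a bucket open indefinitely; that policy is omitted here to keep the sketch focused on the hash lookup.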