Generative AI applied for synthetic data in PMU

IF 4.7 · Zone 3 (Engineering & Technology) · JCR Q2 · ENERGY & FUELS

Energy Reports · Pub Date: 2025-06-10 · DOI: 10.1016/j.egyr.2025.05.062

Felipe Proença de Albuquerque, Eduardo Coelho Marques da Costa, Luisa Helena Bartocci Liboni

Energy Reports, Volume 14, Pages 103-115 · Journal Article · https://www.sciencedirect.com/science/article/pii/S2352484725003439

Citations: 0

Abstract

The growing deployment of Phasor Measurement Units (PMUs) has enhanced power system observability but introduced new challenges related to data privacy, incompleteness, and measurement quality. To address these issues, this paper proposes a data-driven methodology for generating and completing PMU phasor measurements using Generative Artificial Intelligence. Specifically, we employ Variational Autoencoders (VAEs) and Generative Adversarial Networks (GANs) trained on real-world PMU datasets to learn the underlying empirical data distributions without assuming predefined statistical models. The proposed deep generative models are evaluated against traditional statistical techniques based on Gaussian Copulas using a suite of distributional similarity metrics, including Kullback–Leibler (KL) divergence, Hellinger distance, Maximum Deviation Nearest Neighbor (MDNN), and the Kolmogorov–Smirnov (KS) test. The GAN model achieved the best distributional fidelity, with KL divergence as low as 0.0106 and Hellinger distance of 0.0435 for voltage signals. In a synthetic data reconstruction task with 0.5% missing values, the GAN reduced the percentage root mean squared error (PRMSE) to 0.52% for voltage and 2.19% for current—significantly outperforming baseline methods. Moreover, the GAN was able to augment the dataset from 1489 to 5000 samples while preserving key statistical properties, as validated by empirical distribution tests. These results demonstrate that deep generative models not only offer superior accuracy but also provide statistically consistent synthetic PMU data, making them a robust alternative to conventional methods for enhancing power system datasets.
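The distributional metrics named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the histogram binning, the smoothing constant `eps`, the Gaussian toy data, and every function name are assumptions made for illustration. The PRMSE below uses one common normalization (RMSE divided by the RMS of the true values, times 100), since the abstract does not state the exact definition. MDNN is omitted because the abstract does not define it.

```python
import math
import random


def histogram_probs(samples, edges):
    """Bin samples on shared edges and return smoothed bin probabilities."""
    counts = [0] * (len(edges) - 1)
    for x in samples:
        idx = 0
        # Walk to the bin containing x; values past the last edge stay in the last bin.
        while idx < len(counts) - 1 and x >= edges[idx + 1]:
            idx += 1
        counts[idx] += 1
    eps = 1e-9  # assumed smoothing constant so that log(0) never occurs
    total = sum(counts) + eps * len(counts)
    return [(c + eps) / total for c in counts]


def kl_divergence(p, q):
    """Discrete KL divergence D(p || q) in nats."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def hellinger_distance(p, q):
    """Discrete Hellinger distance, bounded in [0, 1]."""
    s = sum((math.sqrt(pi) - math.sqrt(qi)) ** 2 for pi, qi in zip(p, q))
    return math.sqrt(s / 2.0)


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def prmse(true_vals, est_vals):
    """Percentage RMSE under the assumed normalization by the RMS of the true values."""
    num = sum((t - e) ** 2 for t, e in zip(true_vals, est_vals))
    den = sum(t ** 2 for t in true_vals)
    return 100.0 * math.sqrt(num / den)


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy stand-ins for real and synthetic voltage magnitudes (per unit).
    real = [rng.gauss(1.0, 0.02) for _ in range(1489)]
    synth = [rng.gauss(1.0, 0.02) for _ in range(5000)]
    lo, hi = min(real + synth), max(real + synth)
    edges = [lo + (hi - lo) * k / 30 for k in range(31)]
    p = histogram_probs(real, edges)
    q = histogram_probs(synth, edges)
    print("KL:", kl_divergence(p, q))
    print("Hellinger:", hellinger_distance(p, q))
    print("KS:", ks_statistic(real, synth))
```

Both datasets are binned on the same edges so that the KL and Hellinger values compare like with like. The KS statistic works directly on the raw samples and needs no binning.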


Source journal: Energy Reports (Energy: General Energy)

CiteScore: 8.20

Self-citation rate: 13.50%

Articles published: 2608

Review time: 38 days

About the journal: Energy Reports is a new online, multidisciplinary, open access journal that publishes new research in the area of energy with rapid review and publication times. Energy Reports is open to direct submissions and also to submissions from other Elsevier energy journals whose editors have determined that Energy Reports would be a better fit.