Robust resampling and stacked learning models for electricity theft detection in smart grid

IF 4.7 · Zone 3, Engineering & Technology · Q2 ENERGY & FUELS

Energy Reports Pub Date : 2024-12-26 DOI:10.1016/j.egyr.2024.12.041

Ashraf Ullah , Inam Ullah Khan , Muhammad Zeeshan Younas , Maqbool Ahmad , Natalia Kryvinska

Energy Reports, Volume 13, Pages 770-779 · https://www.sciencedirect.com/science/article/pii/S235248472400859X

Citations: 0

Abstract

Electricity theft (ET) is a critical contributor to non-technical losses (NTLs), which significantly threaten the efficiency and reliability of power grids and lead to increased power wastage and financial losses. Despite the development of various artificial intelligence (AI)-based machine learning (ML) and deep learning (DL) approaches for electricity theft detection (ETD), existing methods often exhibit limitations in memorization and generalization, particularly when applied to large-scale electricity consumption datasets characterized by high variance, missing values, and complex nonlinear relationships. These challenges can result in models with high variance and bias, reducing their effectiveness in accurately predicting electricity theft cases. To address these limitations, we propose a three-layer framework that employs a stacking ensemble model to combine the benefits of both ML and DL algorithms. During the first stage of data preprocessing, missing data is imputed through interpolation, and normalization is done through min–max scaling. To address the severe class imbalance prevalent in most real-world datasets, we combine the borderline synthetic minority oversampling technique with near-miss undersampling. In the final layer of our proposed ETD framework, we employ four ML base classifiers and five meta-classifiers. The outputs of the base classifiers are aggregated and passed to a meta-classifier, where we evaluate recurrent neural networks (RNNs) and a convolutional neural network (CNN) as potential meta-classifiers. The RNN variants are long short-term memory (LSTM), gated recurrent unit (GRU), bidirectional LSTM (Bi-LSTM), and bidirectional GRU (Bi-GRU). Experimental results show that the proposed Bi-GRU meta-classifier achieves higher detection accuracy than the other meta-classifiers and other state-of-the-art models used for ETD.
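The preprocessing stage named in the abstract (interpolation of missing readings followed by min–max scaling) can be illustrated with a minimal sketch. The abstract does not say which interpolation scheme the authors use, so linear interpolation between the nearest observed neighbours is an assumption here, as are the edge-gap handling, the function names `interpolate_missing` and `min_max_scale`, and the toy `daily_kwh` series. It is not the authors' implementation.

```python
from typing import List, Optional


def interpolate_missing(readings: List[Optional[float]]) -> List[float]:
    """Fill None gaps by linear interpolation between the nearest observed values.

    Assumption: leading or trailing gaps copy the nearest observed value,
    since there is no neighbour on one side to interpolate against.
    """
    known = [i for i, v in enumerate(readings) if v is not None]
    if not known:
        raise ValueError("series has no observed values")

    filled: List[float] = []
    for i, value in enumerate(readings):
        if value is not None:
            filled.append(float(value))
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled.append(float(readings[right]))
        elif right is None:
            filled.append(float(readings[left]))
        else:
            # Position of the gap between its two observed neighbours, in [0, 1].
            weight = (i - left) / (right - left)
            filled.append(readings[left] + weight * (readings[right] - readings[left]))
    return filled


def min_max_scale(values: List[float]) -> List[float]:
    """Rescale values to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical daily consumption for one customer, with missing readings.
daily_kwh = [2.0, None, 4.0, None, 10.0, None]
filled = interpolate_missing(daily_kwh)   # [2.0, 3.0, 4.0, 7.0, 10.0, 10.0]
scaled = min_max_scale(filled)            # [0.0, 0.125, 0.25, 0.625, 1.0, 1.0]
```

In practice the scaling bounds would normally be computed on training data only and reused for test data, so that no information leaks from the evaluation set; the sketch scales a single series for brevity.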


Source journal

Energy Reports Energy-General Energy

CiteScore: 8.20

Self-citation rate: 13.50%

Articles published: 2608

Review time: 38 days

Journal description: Energy Reports is a new online multidisciplinary open access journal which focuses on publishing new research in the area of Energy with a rapid review and publication time. Energy Reports will be open to direct submissions and also to submissions from other Elsevier Energy journals, whose Editors have determined that Energy Reports would be a better fit.