Feature Extraction of Risk Group and Electricity Theft by using Electrical Profiles and Physical Data for Classification in the Power Utilities
Supakan Janthong, Rakkrit Duangsoithong, K. Chalermyanont
ECTI Transactions on Computer and Information Technology (ECTI-CIT), published 2024-01-20
DOI: https://doi.org/10.37936/ecti-cit.2024181.252738
Abstract
Non-technical loss (NTL) has been a major source of lost revenue for electricity distributors for many years. Distributors have attempted to reduce NTL by detecting electricity theft using various methods, but some events are so difficult to detect that conventional meter inspection is inadequate. Moreover, many of the anomaly patterns found are complex, which makes it hard to distinguish which types of electricity customers are at abnormal risk and which are committing the energy theft that contributes to NTL. This paper proposes five key feature extraction methods and evaluates six supervised learning classifiers for classifying electricity customers. Data were collected from kilowatt meters, electronic meters, time-of-use (TOU) meters, and automatic meter reading (AMR) meters, covering four customer types recorded by the Provincial Electricity Authority (PEA) of Thailand. An electrical profile is extracted for in-depth analysis of the behavior of each customer type and combined with physical data to improve efficiency. Relationships among the features were examined using Pearson correlation, and class imbalance was handled using random oversampling (ROS). The extracted data were then used to train, validate, and test classifiers over three classes: normal, risk, and theft, and the results were evaluated with performance metrics. The results show that random forest (RF) outperforms the other classifiers, achieving an area under the precision-recall curve of 90% and an area under the receiver operating characteristic curve of 78%. The results were also compared with previous studies and benchmark datasets, and the proposed method gave better results than the other techniques.
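As a rough illustration of two preprocessing steps named in the abstract, the sketch below computes a Pearson correlation between two features and balances the three classes by random oversampling. It is not the authors' implementation: the function names `pearson` and `random_oversample`, the toy feature values, and the choice of features (mean monthly kWh and meter age) are all assumptions made for the example.

```python
import math
import random
from collections import Counter


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until every
    class has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, count in counts.items():
        pool = [r for r, l in zip(rows, labels) if l == cls]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical toy data: (mean monthly kWh, meter age in years) per customer.
rows = [(320.0, 2), (310.0, 3), (305.0, 1), (298.0, 4), (150.0, 9), (40.0, 12)]
labels = ["normal", "normal", "normal", "normal", "risk", "theft"]

# Correlation between the two illustrative features.
kwh = [r[0] for r in rows]
age = [r[1] for r in rows]
r_kwh_age = pearson(kwh, age)

# Balance the three classes before training a classifier.
balanced_rows, balanced_labels = random_oversample(rows, labels, seed=42)
class_counts = Counter(balanced_labels)
```

In practice, oversampling is usually applied only to the training split so that duplicated rows do not leak into validation or test data. The abstract does not state how the authors handled this.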