Title: Contrastive learning for efficient anomaly detection in electricity load data
Authors: Mohit Choubey, Rahul Kumar Chaurasiya, J.S. Yadav
Journal: Sustainable Energy, Grids and Networks, vol. 42, Article 101639 (Impact factor: 4.8; JCR: Q2, Energy & Fuels; Category: Engineering and Technology, Region 2)
Publication date: 2025-02-05
Publication type: Journal Article
DOI: 10.1016/j.segan.2025.101639
URL: https://www.sciencedirect.com/science/article/pii/S2352467725000219
Citations: 0
Abstract
Identifying irregularities in electricity load data is essential for maintaining dependable and effective power systems. Traditional approaches require large amounts of labeled data to achieve high accuracy, which increases costs and limits scalability. This paper introduces a feature extraction model based on contrastive learning that substantially improves the accuracy of anomaly detection for electricity load data. The model generates both positive and negative pairs from the original input data sequences, which enables it to learn complex similarities and differences. A contrastive loss function minimizes the distances between positive pairs and maximizes the distances between negative pairs, yielding essential feature representations. The results show substantial improvements: accuracy rose from 69.85 % to 95.65 %, precision from 61.2 % to 96 %, recall from 74.5 % to 93 %, and the F1-score from 67.3 % to 94.6 %. The ROC-AUC score rose from 0.7286 to 0.9532, indicating better differentiation between normal and anomalous data. A paired t-test confirmed these gains with p-values well below 0.05, while Cohen's d indicated large effect sizes across all metrics, supporting their practical significance. Furthermore, 95 % confidence intervals for the mean differences confirmed that the improvements are both statistically and practically meaningful. This approach not only improves detection accuracy but also reduces reliance on large labeled datasets, making it more scalable and cost-effective for real-world applications.
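The abstract does not state the exact form of the contrastive loss, so the following is only a minimal sketch of one common margin-based pairwise variant, not the authors' implementation. The function names, the `margin` parameter, and the toy two-dimensional embeddings are all assumptions made for illustration. It shows the behavior described above: positive pairs are penalized for being far apart, and negative pairs are penalized only while they are closer than the margin.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def contrastive_loss(u, v, is_positive, margin=1.0):
    """Margin-based pairwise contrastive loss (illustrative assumption).

    Positive pairs contribute their squared distance, so minimizing the
    loss pulls them together. Negative pairs contribute only while they
    are closer than `margin`, so minimizing the loss pushes them apart.
    """
    d = euclidean(u, v)
    if is_positive:
        return d ** 2
    return max(0.0, margin - d) ** 2


# Hypothetical embeddings of three load sequences.
anchor = [0.0, 0.0]
near = [0.5, 0.0]   # distance 0.5 from the anchor
far = [3.0, 4.0]    # distance 5.0 from the anchor

loss_pos_near = contrastive_loss(anchor, near, is_positive=True)
loss_neg_near = contrastive_loss(anchor, near, is_positive=False)
loss_neg_far = contrastive_loss(anchor, far, is_positive=False)

print(loss_pos_near, loss_neg_near, loss_neg_far)
```

In this toy setting a negative pair that already lies beyond the margin contributes zero loss, so training effort concentrates on negatives that the encoder still confuses with the anchor.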
About the journal:
Sustainable Energy, Grids and Networks (SEGAN) is an international peer-reviewed publication for theoretical and applied research dealing with energy, information grids and power networks, including smart grids from super to micro grid scales. SEGAN welcomes papers describing fundamental advances in mathematical, statistical or computational methods with application to power and energy systems, as well as papers on applications, computation and modeling in the areas of electrical and energy systems with coupled information and communication technologies.