Tianhao Yi , Lisha Li , Zhiyong Li , Jiaxuan Zhang
Evaluating electricity transmission and distribution efficiency using Data Envelopment Analysis Forest with feature importance
Energy, Volume 330, Article 136580 (Journal Article). Published 2025-05-27. DOI: 10.1016/j.energy.2025.136580
https://www.sciencedirect.com/science/article/pii/S0360544225022224
Citations: 0
Abstract
Data Envelopment Analysis (DEA), a non-parametric method, has been widely used to measure power grid efficiency, a crucial metric of progress in energy development. However, efficiency analysis is hampered by high dimensionality and data noise. To mitigate these problems, this study integrates standard DEA models into an ensemble Random Forest structure, producing a new model called DEA Forest for evaluating power grid sector performance. Simulation results indicate that the new model is robust and discriminative across both low- and high-dimensional inputs and outputs. An empirical analysis of the Chinese power grid sector is also conducted; its results show that, compared with other methods, DEA Forest provides more discriminative efficiency rankings when the power grid data are high-dimensional. Moreover, a novel feature importance measure is proposed for analyzing DEA-based decision-making processes; it provides more discriminative importance values under multicollinearity. After feature selection, this measure maintains a ranking similarity of 91.86% with the original results, compared with 77.89% for the old method. It also identifies the key inputs and outputs that affect electricity transmission and distribution efficiency. The results indicate that, in the era of big data, regulators and scholars need to consider the impact of high-dimensional features and data noise on efficiency results in large-scale power grid development. When analyzing feature importance or conducting feature selection, attention should also be paid to feature correlation.
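The abstract does not spell out how DEA Forest is built, so the following is only a toy sketch of the general idea of averaging DEA scores over random feature subsets. It is not the authors' algorithm. It assumes the simplest possible base model: a constant-returns (CCR) DEA with one input and one output, where the efficiency score reduces to each unit's output/input ratio divided by the best ratio. The function names `ccr_efficiency_1x1` and `ensemble_efficiency`, the parameter `n_models`, and the sample data are all hypothetical.

```python
import random


def ccr_efficiency_1x1(inputs, outputs):
    """CCR efficiency for a single input and a single output.

    In this special case the DEA linear program has a closed form:
    each unit's output/input ratio divided by the best observed ratio.
    """
    ratios = [y / x for x, y in zip(inputs, outputs)]
    best = max(ratios)
    return [r / best for r in ratios]


def ensemble_efficiency(X, Y, n_models=200, seed=0):
    """Average 1x1 CCR scores over randomly drawn (input, output) pairs.

    X: one row of positive inputs per decision-making unit (DMU).
    Y: one row of positive outputs per DMU.
    Illustrative random-subspace ensemble only; an assumption made for
    this sketch, not the DEA Forest construction from the paper.
    """
    rng = random.Random(seed)
    totals = [0.0] * len(X)
    for _ in range(n_models):
        i = rng.randrange(len(X[0]))  # random input column
        r = rng.randrange(len(Y[0]))  # random output column
        scores = ccr_efficiency_1x1(
            [row[i] for row in X], [row[r] for row in Y]
        )
        for k, s in enumerate(scores):
            totals[k] += s
    return [t / n_models for t in totals]


if __name__ == "__main__":
    # Three hypothetical DMUs, two inputs, one output.
    X = [[2.0, 4.0], [4.0, 4.0], [8.0, 8.0]]
    Y = [[2.0], [2.0], [2.0]]
    print(ensemble_efficiency(X, Y, n_models=50, seed=1))
```

Averaging over many low-dimensional sub-models is one way an ensemble can keep scores spread out when the number of inputs and outputs is large, since a full-dimensional DEA tends to rate many units as fully efficient. A realistic base model would solve the multi-input, multi-output DEA linear program instead of the closed-form ratio used here.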
About the journal:
Energy is a multidisciplinary, international journal that publishes research and analysis in the field of energy engineering. Our aim is to become a leading peer-reviewed platform and a trusted source of information for energy-related topics.
The journal covers a range of areas including mechanical engineering, thermal sciences, and energy analysis. We are particularly interested in research on energy modelling, prediction, integrated energy systems, planning, and management.
Additionally, we welcome papers on energy conservation, efficiency, biomass and bioenergy, renewable energy, electricity supply and demand, energy storage, buildings, and economic and policy issues. These topics should align with our broader multidisciplinary focus.