{"title":"The Importance of Predictor Variables and Feature Selection in Day-ahead Electricity Price Forecasting","authors":"Lennard Visser, T. Alskaif, W. V. van Sark","doi":"10.1109/SEST48500.2020.9203273","PeriodicalId":302157,"journal":{"name":"2020 International Conference on Smart Energy Systems and Technologies (SEST)","volume":"17 1","pages":"0"},"publicationDate":"2020-09-01","publicationTypes":"Journal Article","isOpenAccess":false,"citationCount":"6","platform":"Semanticscholar","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SEST48500.2020.9203273"}
The Importance of Predictor Variables and Feature Selection in Day-ahead Electricity Price Forecasting
Electricity spot market prices are increasingly affected by an expanding share of renewables and a growing number of market participants. In an attempt to improve forecasting accuracy, this paper evaluates the importance of 62 predictor variables for forecasting the day-ahead electricity price. These variables describe the electricity price, load, generation and weather at different times in the Netherlands, Belgium and Germany. In this study we assess the performance of four machine learning models that forecast the electricity price. Next, we rank the variables according to their importance and identify to what extent different estimators and feature selection methods affect the performance of the forecasting models. First, we found that Random Forest regression is the best performing model regardless of the number of features selected and the feature selection method applied. Second, the performance of all models was not found to improve significantly beyond the selection of the top 15 ranked variables. Interestingly, the top ranked variables differ significantly per selection method. Moreover, the feature selection methods based on Multi-variate Linear Regression and linear kernel Support Vector Machine were found to give the best performance for all models.
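The ranking-then-selection procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic (10 made-up predictors rather than the 62 real ones), the ranking uses the absolute coefficients of an ordinary least-squares fit on standardized inputs as a stand-in for the linear-regression-based selection method, and the same linear model scores each top-k subset by mean absolute error, a metric assumed here because the abstract does not name one. The function names `rank_features_by_linear_coef` and `holdout_mae` are hypothetical.

```python
import numpy as np


def rank_features_by_linear_coef(X, y):
    """Return feature indices sorted from most to least important.

    Importance is the absolute coefficient of a least-squares fit on
    standardized inputs (an assumed proxy for a linear-regression ranking).
    """
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    A = np.column_stack([Xs, np.ones(len(Xs))])  # add intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    importance = np.abs(coef[:-1])  # drop the intercept
    return np.argsort(importance)[::-1]


def holdout_mae(X_train, y_train, X_test, y_test, cols):
    """Fit a linear model on the chosen columns and return the test-set MAE."""
    A_train = np.column_stack([X_train[:, cols], np.ones(len(X_train))])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    A_test = np.column_stack([X_test[:, cols], np.ones(len(X_test))])
    return float(np.mean(np.abs(A_test @ coef - y_test)))


# Synthetic data: only features 0, 3 and 7 actually drive the target.
rng = np.random.default_rng(0)
n_samples, n_features = 400, 10
X = rng.normal(size=(n_samples, n_features))
true_coef = np.zeros(n_features)
true_coef[[0, 3, 7]] = [5.0, -3.0, 2.0]
y = X @ true_coef + rng.normal(scale=0.1, size=n_samples)

# Chronological-style split: first 300 rows for training, the rest for testing.
split = 300
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

# Rank on the training data only, then score every top-k subset.
ranking = rank_features_by_linear_coef(X_train, y_train)
mae_by_k = {
    k: holdout_mae(X_train, y_train, X_test, y_test, ranking[:k])
    for k in range(1, n_features + 1)
}

for k, mae in mae_by_k.items():
    print(f"top {k:2d} features: MAE = {mae:.3f}")
```

In this toy setting the error stops improving once the three informative features are included, which mirrors the kind of plateau the abstract reports after the top 15 ranked variables. Swapping in another estimator for the ranking or scoring step (for example a Random Forest or a linear-kernel SVM, as studied in the paper) would follow the same rank, select, evaluate loop.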