Machine learning and deep learning models based grid search cross validation for short-term solar irradiance forecasting

IF 8.6 · CAS Zone 2 (Computer Science) · JCR Q1 (Computer Science, Theory & Methods)
Doaa El-Shahat, Ahmed Tolba, Mohamed Abouhawwash, Mohamed Abdel-Basset
Journal of Big Data. Published 2024-09-18. Journal Article.
DOI: 10.1186/s40537-024-00991-w (https://doi.org/10.1186/s40537-024-00991-w)
Citation count: 0

Abstract

In late 2023, the United Nations conference on climate change (COP28), held in Dubai, encouraged a rapid transition from fossil fuels to renewable energy. Solar energy is one of the most promising energy sources that is both sustainable and renewable. Photovoltaic (PV) systems convert solar irradiance into electricity, but the instability and intermittency of solar radiation can interrupt electricity production. Accurate forecasting of solar irradiance supports sustainable power production even when solar irradiance is absent, because batteries can store solar energy for use during those periods. Deterministic models, by contrast, depend on the technical specifications of the PV system and may not be accurate at low solar irradiance. This paper presents a comparative study of the most common Deep Learning (DL) and Machine Learning (ML) algorithms used for short-term solar irradiance forecasting. The dataset was collected in Islamabad over a five-year period, from 2015 to 2019, at hourly intervals using accurate meteorological sensors. Grid Search Cross Validation (GSCV) with five folds is applied to the ML and DL models to optimize their hyperparameters. The algorithms are assessed with several performance metrics: the Adjusted R² score, Normalized Root Mean Square Error (NRMSE), Mean Absolute Deviation (MAD), Mean Absolute Error (MAE) and Mean Square Error (MSE). The statistical analysis shows that CNN-LSTM outperforms its counterparts among nine well-known DL models, with an Adjusted R² score of 0.984. Among the ML algorithms, gradient boosting regression is an effective forecasting method with an Adjusted R² score of 0.962, beating its rivals among six ML models. Finally, SHAP and LIME, two explainable Artificial Intelligence (XAI) techniques, are used to understand the reasons behind the obtained results.
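The tuning and evaluation procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the data below is synthetic (the Islamabad dataset is not reproduced here), the feature names, the hyperparameter grid and the scoring choice are assumptions, and NRMSE is normalized by the range of the observed values, which is only one of several common conventions. MAD is left out because the abstract does not define how it differs from MAE. The sketch uses scikit-learn's `GridSearchCV` with `cv=5` on a `GradientBoostingRegressor` and then computes the Adjusted R² score, MSE, MAE and NRMSE on a held-out split.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV, train_test_split


def adjusted_r2(y_true, y_pred, n_features):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    n = y_true.size
    return float(1.0 - (1.0 - r2) * (n - 1) / (n - n_features - 1))


def error_metrics(y_true, y_pred):
    """MSE, MAE and range-normalized RMSE (normalization is an assumption)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mse = float(np.mean(err ** 2))
    mae = float(np.mean(np.abs(err)))
    nrmse = float(np.sqrt(mse) / (y_true.max() - y_true.min()))
    return {"MSE": mse, "MAE": mae, "NRMSE": nrmse}


# Synthetic stand-in for hourly meteorological data (hypothetical features).
rng = np.random.default_rng(0)
n_samples = 400
hour = rng.uniform(0.0, 24.0, n_samples)
temperature = rng.uniform(5.0, 40.0, n_samples)
humidity = rng.uniform(10.0, 90.0, n_samples)
X = np.column_stack([hour, temperature, humidity])
daylight = np.clip(np.sin(np.pi * (hour - 6.0) / 12.0), 0.0, None)
y = 800.0 * daylight * (1.0 - 0.004 * humidity) + rng.normal(0.0, 20.0, n_samples)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Illustrative grid; the paper's actual search space is not given in the abstract.
param_grid = {
    "n_estimators": [50, 100],
    "learning_rate": [0.05, 0.1],
    "max_depth": [2, 3],
}

# Exhaustive search over the grid, scored by five-fold cross-validation.
search = GridSearchCV(
    GradientBoostingRegressor(random_state=0),
    param_grid,
    cv=5,
    scoring="neg_mean_squared_error",
)
search.fit(X_train, y_train)

y_pred = search.predict(X_test)  # uses the refitted best estimator
scores = error_metrics(y_test, y_pred)
scores["Adjusted R2"] = adjusted_r2(y_test, y_pred, n_features=X.shape[1])

print("best hyperparameters:", search.best_params_)
print("held-out scores:", scores)
```

With `cv=5` and a regressor, scikit-learn splits the training data into five consecutive folds without shuffling. For strongly autocorrelated hourly series, `TimeSeriesSplit` is a common alternative; whether the paper uses it is not stated in the abstract.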


Source journal
Journal of Big Data (subject area: Computer Science, Information Systems)
CiteScore: 17.80
Self-citation rate: 3.70%
Articles published: 105
Review time: 13 weeks
Journal description: The Journal of Big Data publishes high-quality, scholarly research papers, methodologies, and case studies covering a broad spectrum of topics, from big data analytics to data-intensive computing and all applications of big data research. It addresses challenges facing big data today and in the future, including data capture and storage, search, sharing, analytics, technologies, visualization, architectures, data mining, machine learning, cloud computing, distributed systems, and scalable storage. The journal serves as a seminal source of innovative material for academic researchers and practitioners alike.