Haoxiang Zhang, Sunny Chaudhary, Carlos D. Rodríguez-Gallegos, Tasmiat Rahman
Advancements in solar spectral irradiance modelling for photovoltaic systems: A machine learning approach utilising on-site data
Next Energy, vol. 8, Article 100320. Published 2025-06-12. DOI: 10.1016/j.nxener.2025.100320
https://www.sciencedirect.com/science/article/pii/S2949821X25000833
Abstract
Energy yield estimation for photovoltaics (PV) plays a crucial role in the growth of renewable energy. Spectrally resolved irradiance is key to reducing the uncertainty in these estimates. In the field of PV, radiative transfer models (RTMs) and spectroradiometers are commonly used to determine the solar spectral irradiance needed to assess spectral effects. However, both approaches have inherent limitations: RTMs require precise and complex aerosol and meteorological inputs, while spectroradiometers are costly. Building on advances in machine learning (ML), this study proposes a data-driven spectral irradiance model that requires as input only the global horizontal irradiance (GHI) measured by a pyranometer and a reference cell. Spectral and meteorological data collected by the Solar Energy Research Institute of Singapore (SERIS) at four sites across three continents are used to train and test the models. We examined the viability of three ML techniques for spectral modelling: Long Short-Term Memory (LSTM) networks, Random Forest (RF) and Extreme Gradient Boosting (XGBoost). XGBoost achieves relatively good accuracy at a much lower computational cost than LSTM and RF. The proposed ML model achieves an overall R² of 0.974, compared with 0.646 for the SMARTS model, over the spectral range 350.4–1052.4 nm. The ML models outperform the SMARTS model particularly under intermediate and overcast conditions. We also show that a model trained on data from a specific site cannot be effectively applied to other locations.
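The overall R² reported in the abstract compares modelled and measured spectral irradiance. The abstract does not say how the score is pooled over wavelengths and timestamps, so the following is only a minimal sketch, assuming the standard coefficient of determination computed over all (timestamp, wavelength) samples flattened together. The function names and the numbers are hypothetical and are not taken from the paper.

```python
def r_squared(measured, modelled):
    """Coefficient of determination, 1 - SS_res / SS_tot, over paired samples."""
    if len(measured) != len(modelled) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    mean = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, modelled))
    ss_tot = sum((m - mean) ** 2 for m in measured)
    if ss_tot == 0:
        raise ValueError("measured values have zero variance")
    return 1.0 - ss_res / ss_tot


def overall_r_squared(measured_spectra, modelled_spectra):
    """Pool every (timestamp, wavelength) sample, then compute a single R^2.

    Assumption: pooling by flattening; the paper may aggregate differently.
    """
    flat_measured = [v for spectrum in measured_spectra for v in spectrum]
    flat_modelled = [v for spectrum in modelled_spectra for v in spectrum]
    return r_squared(flat_measured, flat_modelled)


# Made-up spectra (W m^-2 nm^-1): two timestamps, three wavelength bins each.
measured = [[0.50, 1.20, 0.90], [0.20, 0.55, 0.40]]
modelled = [[0.48, 1.25, 0.88], [0.25, 0.50, 0.42]]
print(round(overall_r_squared(measured, modelled), 3))
```

A perfect model scores 1 under this definition, and a model that always predicts the mean of the measurements scores 0, which gives a rough sense of the gap between the reported 0.974 and 0.646.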