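Machine learning versus empirical models to predict daily global solar irradiation in an average year: Homogeneous parallel ensembles prevailed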

Keith De Souza
Journal: Journal of Solar Energy Engineering
Publication type: Journal Article
Published: 2024-07-16
DOI: 10.1115/1.4065978 (https://doi.org/10.1115/1.4065978)
Citations: 0

Abstract

Accurate predictive models of daily global horizontal irradiation are essential for diverse solar energy applications, and their long-term performance can be assessed using average years. This study scrutinized 70 machine learning and 44 empirical models using two disjoint five-year average daily datasets, one for training and one for validation, each comprising 365 records and 10 features. The features included day number, minimum and maximum air temperature, air temperature amplitude, theoretical and observed sunshine hours, theoretical extraterrestrial horizontal irradiation, relative sunshine, cloud cover and relative humidity. Fourteen machine learning algorithms were trained and validated: multiple linear regression, ridge regression, lasso regression, elastic net regression, Huber regression, k-nearest neighbors, decision tree, support vector machine, multilayer perceptron, extreme learning machine, generalized regression neural network, extreme gradient boosting, gradient boosting machine and light gradient boosting machine. Each was also instantiated as the base learner in four strategically designed homogeneous parallel ensembles (variants of pasting, random subspace, bagging and random patches), which were scrutinized as well, giving 14 × 5 = 70 models. Specific hyperparameters of the algorithms were optimized. Validation showed that at least two ensembles outperformed their corresponding individual model. Huber-subspace ranked first with a root mean square error of 1.495 MJ/m²/day. The multilayer perceptron was the most robust to the random perturbations of the ensembles, which suggests good tolerance to noise in ground truth data. The best empirical model returned a validation root mean square error of 1.595 MJ/m²/day but was outperformed by 93% of the machine learning models, with the homogeneous parallel ensembles producing superior predictive accuracies.
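The four ensemble types named in the abstract differ only in which rows (records) and columns (features) each base learner is shown. The sketch below illustrates the textbook definitions of pasting, bagging, random subspace and random patches on a 365-record, 10-feature dataset, together with the root mean square error used for ranking. It is not the author's implementation: the abstract describes "variants" of these schemes without giving details, so the function `draw_view`, the fractions `row_frac` and `col_frac`, and the choice to draw rows without replacement for random patches are all assumptions made for illustration.

```python
import math
import random

# name: (subsample rows?, draw rows with replacement?, subsample features?)
VARIANTS = {
    "pasting": (True, False, False),          # row subsets, no replacement
    "bagging": (True, True, False),           # bootstrap rows, all features
    "random_subspace": (False, False, True),  # all rows, feature subsets
    "random_patches": (True, False, True),    # row subsets and feature subsets
}


def draw_view(variant, n_rows, n_cols, rng, row_frac=0.8, col_frac=0.5):
    """Return the (rows, cols) indices one base learner would be trained on.

    row_frac and col_frac are hypothetical values, not taken from the paper.
    """
    sub_rows, replace, sub_cols = VARIANTS[variant]
    if not sub_rows:
        rows = list(range(n_rows))
    elif replace:
        # Bootstrap: n_rows draws with replacement, so duplicates are expected.
        rows = [rng.randrange(n_rows) for _ in range(n_rows)]
    else:
        rows = rng.sample(range(n_rows), int(row_frac * n_rows))
    if sub_cols:
        cols = sorted(rng.sample(range(n_cols), max(1, int(col_frac * n_cols))))
    else:
        cols = list(range(n_cols))
    return rows, cols


def rmse(y_true, y_pred):
    """Root mean square error, in the units of the target (here MJ/m²/day)."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


# Each algorithm yields one individual model plus one model per ensemble type.
n_algorithms, n_ensembles = 14, len(VARIANTS)
n_models = n_algorithms * (1 + n_ensembles)

if __name__ == "__main__":
    rng = random.Random(0)
    for name in VARIANTS:
        rows, cols = draw_view(name, n_rows=365, n_cols=10, rng=rng)
        print(name, "rows:", len(rows), "distinct rows:", len(set(rows)),
              "features:", len(cols))
    print("models:", n_models)
```

In a homogeneous parallel ensemble every base learner uses the same algorithm, is trained independently on its own view of the data, and the predictions are then combined, typically by averaging for regression. A name such as "Huber-subspace" would, under this reading, denote Huber regressors trained on random feature subsets.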