Two-stage estimation of hourly diffuse solar radiation across China using end-to-end gradient boosting with sequentially boosted features

IF 11.1 · Zone 1 (Earth Sciences) · JCR Q1 (Environmental Sciences)
Lu Chen, Haoze Shi, Hong Tang, Xin Yang, Chao Ji, Zhigang Li, Yuhong Tu
Remote Sensing of Environment, vol. 315, Article 114445. Published 2024-10-01. DOI: 10.1016/j.rse.2024.114445
https://www.sciencedirect.com/science/article/pii/S0034425724004711
Citations: 0

Abstract

Diffuse solar radiation (DR) constitutes a vital component of solar energy reaching the surface of the Earth. The demand for extensive temporal and spatial coverage of DR data has intensified in the realms of solar energy harvesting, agriculture, and climate change. However, until now, long-term DR observations have only been available from 17 stations across mainland China. Consequently, there is a pressing need to estimate spatially continuous, high-temporal-resolution DR for large-scale regions in China. The current hindrance to DR estimations stems from the scarcity of stations equipped with DR observations. This study proposes a two-stage strategy to efficiently estimate seamless DR in 2019 at a national scale, leveraging both DR and total solar radiation (TR) observations from numerous stations. In the first stage, the approach generates virtual DR at TR stations by establishing a learned relationship between DR and TR observations. Subsequently, in the second stage, these virtual DR data, in conjunction with satellite and reanalysis datasets, are utilized to estimate national-scale DR. Additionally, a novel model, End-to-end Gradient Boosting with Shortcuts and Feature selection (EGB-SF), is introduced to estimate DR over China. One advantage of this model is its consideration of the impact of sequentially boosted features and their interactions. Embedded shortcut connections fully exploit the influence of existing features on newly introduced ones during the learning process. Beyond enhancing the accuracy of DR estimation, the EGB-SF algorithm can also elucidate the relative importance levels of input features to the model. Moreover, the two-stage strategy outperforms the method of estimating national DR using only DR observations, as evidenced by its superior spatial generalization abilities. Statistical evaluation, collaborative analysis with influencing factors, and comparisons with related products confirm the accuracy and spatial continuity of the DR estimations in this study. These results furnish reliable DR data across China for research in agriculture, climate, solar radiation, and related fields.
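The two-stage strategy described in the abstract can be illustrated with a small, self-contained sketch. Everything here is an assumption made for illustration: the data are synthetic (the toy relation between clearness, TR, and DR is not a physical model), the "gridded" predictors are a made-up stand-in for satellite and reanalysis inputs, and the learner is plain gradient boosting with decision stumps. It is not EGB-SF; the shortcut connections and feature selection of that model are not reproduced.

```python
import math
import random

def fit_stump(X, r):
    """Find the one-feature threshold split that best fits residuals r (squared error)."""
    n, total = len(X), sum(r)
    best = (-1.0, 0, math.inf, total / n, total / n)  # fallback: no split at all
    for j in range(len(X[0])):
        order = sorted(range(n), key=lambda i: X[i][j])
        left = 0.0
        for k in range(1, n):
            left += r[order[k - 1]]
            a, b = X[order[k - 1]][j], X[order[k]][j]
            if a == b:
                continue  # cannot place a threshold between equal values
            # A larger score means a lower squared error after the split.
            score = left ** 2 / k + (total - left) ** 2 / (n - k)
            if score > best[0]:
                best = (score, j, (a + b) / 2, left / k, (total - left) / (n - k))
    return best[1:]  # (feature index, threshold, left value, right value)

def fit_gbm(X, y, n_rounds=100, lr=0.1):
    """Plain gradient boosting for squared loss: each stump fits the current residuals."""
    base = sum(y) / len(y)
    pred, stumps = [base] * len(y), []
    for _ in range(n_rounds):
        resid = [yi - pi for yi, pi in zip(y, pred)]
        j, t, lv, rv = fit_stump(X, resid)
        stumps.append((j, t, lv, rv))
        pred = [p + lr * (lv if x[j] <= t else rv) for p, x in zip(pred, X)]
    return base, lr, stumps

def predict(model, X):
    base, lr, stumps = model
    return [base + lr * sum(lv if x[j] <= t else rv for j, t, lv, rv in stumps)
            for x in X]

def sample(rng, n):
    """Toy hourly samples (clearness kt, cos zenith, TR, DR). Hypothetical relation."""
    rows = []
    for _ in range(n):
        kt, cosz = rng.uniform(0.1, 0.8), rng.uniform(0.2, 1.0)
        tr = 1000.0 * kt * cosz
        dr = tr * (1.0 - 0.9 * kt) + rng.gauss(0.0, 5.0)
        rows.append((kt, cosz, tr, dr))
    return rows

def grid_features(rng, rows):
    """Stand-in for satellite/reanalysis predictors: noisy cloudiness proxy, cos zenith."""
    return [[1.0 - kt + rng.gauss(0.0, 0.03), cosz] for kt, cosz, _, _ in rows]

def rmse(pred, truth):
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth))

rng = random.Random(0)
dr_sites = sample(rng, 150)  # few sites observe both TR and DR
tr_sites = sample(rng, 400)  # many sites observe TR only (their true DR is never used)
held_out = sample(rng, 200)  # locations used only for evaluation

# Stage 1: learn DR from TR at DR sites, then generate virtual DR at TR-only sites.
stage1 = fit_gbm([[tr, cosz] for _, cosz, tr, _ in dr_sites], [r[3] for r in dr_sites])
virtual_dr = predict(stage1, [[tr, cosz] for _, cosz, tr, _ in tr_sites])

# Stage 2: map gridded predictors to virtual DR, then estimate DR at unseen locations.
stage2 = fit_gbm(grid_features(rng, tr_sites), virtual_dr)
estimate = predict(stage2, grid_features(rng, held_out))

truth = [r[3] for r in held_out]
rmse_model = rmse(estimate, truth)
rmse_mean = rmse([sum(virtual_dr) / len(virtual_dr)] * len(truth), truth)
print(f"two-stage RMSE: {rmse_model:.1f}  mean-only baseline RMSE: {rmse_mean:.1f}")
```

The point of the sketch is the data flow rather than the learner: stage 2 never sees an observed DR value, only the virtual DR produced by stage 1, which is what lets the more numerous TR stations contribute training samples.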
Source journal: Remote Sensing of Environment (Environmental Sciences; Imaging Science and Photographic Technology)
CiteScore: 25.10
Self-citation rate: 8.90%
Articles published: 455
Review time: 53 days
About the journal: Remote Sensing of Environment (RSE) serves the Earth observation community by disseminating results on the theory, science, applications, and technology that contribute to advancing the field of remote sensing. With a thoroughly interdisciplinary approach, RSE encompasses terrestrial, oceanic, and atmospheric sensing. The journal emphasizes biophysical and quantitative approaches to remote sensing at local to global scales, covering a diverse range of applications and techniques. RSE serves as a vital platform for the exchange of knowledge and advancements in the dynamic field of remote sensing.