A soil temperature dataset based on random forest in the Three River Source Region
Authors: Xiaoqing Tan, Siqiong Luo, Hongmei Li, Zhuoqun Li, Qingxue Dong
Journal: Scientific Data, 12(1), 882 (Journal Article, published 2025-05-27)
DOI: https://doi.org/10.1038/s41597-025-04910-3
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12116792/pdf/
Citations: 0
Abstract
Changes in soil temperature (ST) in the Three River Source Region (TRSR) significantly influence regional climate, ecology, and hydrological processes. However, existing models and reanalysis data exhibit considerable deviations in ST due to limitations in physical processes and parameterization schemes. To address this issue, we developed a new ST dataset using the Random Forest method (RFST), integrating observed ST data with relevant gridded datasets. RFST provides monthly ST data at nine layers with a spatial resolution of 0.01° × 0.01° from 1982 to 2015. Validation against two soil observation networks and six meteorological stations shows that the Nash-Sutcliffe Efficiency (NSE) of RFST exceeds 0.7 at all depths. Compared to ERA5 and CRA40, RFST corrects the cold bias, improves NSE, and reduces RMSE from 4-8 °C to 1-2 °C. RFST not only corrects the underestimation of ST and its warming rate but also aligns more closely with observed values for surface freezing and thawing indices as well as soil freeze-thaw periods, providing a more accurate representation of soil thermal conditions in the TRSR.
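For readers unfamiliar with the validation metrics named in the abstract, the following is a minimal sketch of how NSE, RMSE, and mean bias are conventionally computed from paired observed and estimated values. It uses the standard textbook definitions, not code from the paper. The function names and the sample temperatures are hypothetical and chosen only for illustration.

```python
import math

def nse(observed, simulated):
    """Nash-Sutcliffe Efficiency: 1 is a perfect match, 0 is no better than the observed mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance

def rmse(observed, simulated):
    """Root-mean-square error, in the same units as the inputs (here, degrees C)."""
    squared = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    return math.sqrt(squared / len(observed))

def mean_bias(observed, simulated):
    """Mean of (simulated - observed); a negative value indicates a cold bias."""
    return sum(s - o for o, s in zip(observed, simulated)) / len(observed)

# Hypothetical monthly soil temperatures (degrees C), not taken from the dataset.
observed = [-10.0, -5.0, 0.0, 5.0, 10.0, 5.0]
# A hypothetical product that is uniformly 5 degrees C too cold.
cold_product = [o - 5.0 for o in observed]

print(nse(observed, cold_product))
print(rmse(observed, cold_product))
print(mean_bias(observed, cold_product))
```

In this toy case the uniformly cold series yields a negative mean bias and an RMSE equal to the size of the offset, which is the kind of systematic error the abstract reports RFST as correcting relative to ERA5 and CRA40.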
About the journal:
Scientific Data is an open-access journal focused on data, publishing descriptions of research datasets and articles on data sharing across natural sciences, medicine, engineering, and social sciences. Its goal is to enhance the sharing and reuse of scientific data, encourage broader data sharing, and acknowledge those who share their data.
The journal primarily publishes Data Descriptors, which offer detailed descriptions of research datasets, including data collection methods and technical analyses validating data quality. These descriptors aim to facilitate data reuse rather than testing hypotheses or presenting new interpretations, methods, or in-depth analyses.