Improving Model-Based Carbon Dioxide Datasets Using Deep Learning and Satellite Observations
Authors: Farhan Mustafa; Ming Xu
Journal: IEEE Transactions on Geoscience and Remote Sensing, vol. 63, pp. 1-14 (JCR Q1, Engineering, Electrical & Electronic; impact factor 7.5)
Published: 2025-03-31
DOI: 10.1109/TGRS.2025.3556309
URL: https://ieeexplore.ieee.org/document/10945896/
Citations: 0
Abstract
Long-term, regular monitoring of atmospheric carbon dioxide (CO2) is crucial to meeting the goals of the Paris Agreement. Satellites reliably estimate atmospheric CO2 concentrations, but their limitations, such as infrequent revisit periods and atmospheric challenges, leave significant spatiotemporal gaps in their data. In addition, satellite-based CO2 products do not cover long periods. Model-derived CO2 data products, by contrast, offer better temporal coverage and temporal resolution than the satellite datasets, but they are less consistent than the satellite observations. Using satellite retrievals to enhance model-derived CO2 estimates can therefore produce long-term, more consistent datasets. In this study, we developed a deep learning-based hybrid model that combines a convolutional neural network (CNN) with a long short-term memory (LSTM) network to improve the quality of the column-averaged dry-air mole fraction of CO2 (XCO2) estimates from the CAMS-EGG4 reanalysis dataset (2003 to 2020) using Orbiting Carbon Observatory 2 (OCO-2) satellite observations. The proposed model achieved high accuracy, with a coefficient of determination ($R^2$) of 0.99 and a root mean squared error (RMSE) of 0.78 ppm. We compared the deep learning model with other machine learning approaches, namely the extreme gradient boosting (XGBoost) and random forest (RF) models. The deep learning model outperformed both comparison models in terms of $R^2$, RMSE, and mean absolute error (MAE). In addition, we comprehensively validated the original and the predicted XCO2 estimates against ground-based observations from the Total Carbon Column Observing Network (TCCON), and the predicted results showed significant improvement over the original. The improved results were also compared with the National Oceanic and Atmospheric Administration (NOAA) CarbonTracker XCO2 dataset. Both datasets showed consistent spatial distributions, interannual changes, seasonal variations, and annual growth rates.
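The three scores named in the abstract ($R^2$, RMSE, and MAE) can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function name `regression_metrics` and the sample XCO2 values are hypothetical, and $R^2$ is computed here as 1 - SS_res/SS_tot, which is one common definition and may differ from the one used in the paper.

```python
import math


def regression_metrics(observed, predicted):
    """Return (r2, rmse, mae) for paired observed and predicted values.

    Hypothetical helper for illustration; R^2 is taken as
    1 - SS_res / SS_tot (an assumption, not confirmed by the paper).
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]
    mean_obs = sum(observed) / n
    ss_res = sum(r * r for r in residuals)                # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)   # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(r) for r in residuals) / n
    return r2, rmse, mae


# Made-up XCO2 values in ppm, for illustration only (not data from the study).
observed = [410.0, 412.0, 414.0, 416.0]
predicted = [410.5, 411.5, 414.5, 415.5]
r2, rmse, mae = regression_metrics(observed, predicted)
print(f"R2={r2:.2f}  RMSE={rmse:.2f} ppm  MAE={mae:.2f} ppm")
```

RMSE and MAE are in the same units as the data (ppm here), while $R^2$ is unitless, which is why the abstract reports a unit only for RMSE.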
About the journal:
IEEE Transactions on Geoscience and Remote Sensing (TGRS) is a monthly publication that focuses on the theory, concepts, and techniques of science and engineering as applied to sensing the land, oceans, atmosphere, and space; and the processing, interpretation, and dissemination of this information.