Global atmospheric data assimilation with multi-modal masked autoencoders

Thomas J. Vandal, Kate Duffy, Daniel McDuff, Yoni Nachmany, Chris Hartshorn
arXiv - PHYS - Atmospheric and Oceanic Physics, published 2024-07-16
DOI: arxiv-2407.11696 (https://doi.org/arxiv-2407.11696)

Abstract

Global data assimilation enables weather forecasting at all scales and provides valuable data for studying the Earth system. However, the computational demands of the physics-based algorithms used in operational systems limit the volume and diversity of observations that can be assimilated. Here, we present "EarthNet", a multi-modal foundation model for data assimilation that learns to predict a global, gap-filled atmospheric state solely from satellite observations. EarthNet is trained as a masked autoencoder that ingests a 12-hour sequence of observations and learns to fill in missing data from other sensors. We show that EarthNet performs a form of data assimilation, producing a global 0.16-degree reanalysis dataset of 3D atmospheric temperature and humidity in a fraction of the time required by operational systems. By evaluating a 1-hour forecast background state against observations, we show that the resulting reanalysis dataset reproduces climatology. We also show that our 3D humidity predictions outperform the MERRA-2 and ERA5 reanalyses by 10% to 60% between the middle troposphere and lower stratosphere (5 to 20 km altitude), and that our 3D temperature and humidity are statistically equivalent to the Microwave integrated Retrieval System (MiRS) observations at nearly every level of the atmosphere. Our results indicate significant promise in using EarthNet for high-frequency data assimilation and global weather forecasting.
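
The masked-autoencoder training idea described in the abstract can be illustrated with a small sketch. This is not EarthNet's architecture or code: the array shape, the 60% coverage, the 50% mask fraction, and the function names `make_training_example` and `masked_reconstruction_loss` are all hypothetical, chosen only to show how observed values can be hidden from a model and then used as reconstruction targets.

```python
import numpy as np


def make_training_example(obs, observed, mask_fraction, rng):
    """Hide a random subset of the observed entries from the model input."""
    # Only entries that were actually observed can be hidden and later scored.
    hidden = observed & (rng.random(obs.shape) < mask_fraction)
    visible = observed & ~hidden
    # Zero-fill everything the model is not allowed to see (assumed convention).
    model_input = np.where(visible, obs, 0.0)
    return model_input, visible, hidden


def masked_reconstruction_loss(pred, target, observed, hidden):
    """Mean squared error over entries that were observed but hidden."""
    scored = observed & hidden
    if not scored.any():
        return 0.0
    diff = pred[scored] - target[scored]
    return float(np.mean(diff ** 2))


rng = np.random.default_rng(0)

# Hypothetical toy data: (time steps, sensors, latitude, longitude).
obs = rng.normal(size=(12, 3, 8, 16))
# Satellite swaths leave gaps; here about 60% of entries are observed.
observed = rng.random(obs.shape) < 0.6

model_input, visible, hidden = make_training_example(obs, observed, 0.5, rng)

# Stand-in for a trained model: the zero-filled input is used as the prediction.
baseline_loss = masked_reconstruction_loss(model_input, obs, observed, hidden)
# A perfect reconstruction of the hidden values gives zero loss.
perfect_loss = masked_reconstruction_loss(obs, obs, observed, hidden)
print(baseline_loss, perfect_loss)
```

Scoring only the entries that were observed but hidden means the loss never depends on values no sensor measured, which is what lets such a model learn gap filling from incomplete observations alone.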