Sungduk Yu, Mike Pritchard, Po-Lun Ma, Balwinder Singh, Sam Silva
{"title":"Two-step hyperparameter optimization method: Accelerating hyperparameter search by using a fraction of a training dataset","authors":"Sungduk Yu, Mike Pritchard, Po-Lun Ma, Balwinder Singh, Sam Silva","doi":"10.1175/aies-d-23-0013.1","DOIUrl":"https://doi.org/10.1175/aies-d-23-0013.1","url":null,"abstract":"Hyperparameter optimization (HPO) is an important step in machine learning (ML) model development, but common practices are archaic, primarily relying on manual or grid searches. This is partly because adopting advanced HPO algorithms introduces added complexity to the workflow, leading to longer computation times. This poses a notable challenge to ML applications, as suboptimal hyperparameter selections curtail the potential of ML model performance, ultimately obstructing the full exploitation of ML techniques. In this article, we present a two-step HPO method as a strategic solution to curbing computational demands and wait times, gleaned from practical experiences in applied ML parameterization work. The initial phase involves a preliminary evaluation of hyperparameters on a small subset of the training dataset, followed by a re-evaluation of the top-performing candidate models post-retraining with the entire training dataset. This two-step HPO method is universally applicable across HPO search algorithms, and we argue it has attractive efficiency gains. As a case study, we present our recent application of the two-step HPO method to the development of neural network emulators for aerosol activation. Although our primary use case is a data-rich limit with many millions of samples, we also find that using up to 0.0025% of the data (a few thousand samples) in the initial step is sufficient to find optimal hyperparameter configurations from much more extensive sampling, achieving up to 135× speed-up. 
The benefits of this method materialize through an assessment of hyperparameters and model performance, revealing the minimal model complexity required to achieve the best performance. The assortment of top-performing models harvested from the HPO process allows us to choose a high-performing model with a low inference cost for efficient use in global climate models (GCMs).","PeriodicalId":94369,"journal":{"name":"Artificial intelligence for the earth systems","volume":"79 3-4","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135272893","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Katherine Haynes, Jason Stock, Jack Dostalek, Charles Anderson, Imme Ebert-Uphoff
{"title":"Exploring the Use of Machine Learning to Improve Vertical Profiles of Temperature and Moisture","authors":"Katherine Haynes, Jason Stock, Jack Dostalek, Charles Anderson, Imme Ebert-Uphoff","doi":"10.1175/aies-d-22-0090.1","DOIUrl":"https://doi.org/10.1175/aies-d-22-0090.1","url":null,"abstract":"Vertical profiles of temperature and dewpoint are useful in predicting deep convection that leads to severe weather, which threatens property and lives. Currently, forecasters rely on observations from radiosonde launches and numerical weather prediction (NWP) models. Radiosonde observations are, however, temporally and spatially sparse, and NWP models contain inherent errors that influence short-term predictions of high-impact events. This work explores using machine learning (ML) to postprocess NWP model forecasts, combining them with satellite data to improve vertical profiles of temperature and dewpoint. We focus on different ML architectures, loss functions, and input features to optimize predictions. Because we are predicting vertical profiles at 256 levels in the atmosphere, this work provides a unique perspective on using ML for 1-D tasks. Compared to baseline profiles from the Rapid Refresh (RAP), ML predictions offer the largest improvement for dewpoint, particularly in the mid- and upper-atmosphere. Temperature improvements are modest, but CAPE values are improved by up to 40%. Feature importance analyses indicate that the ML models are primarily improving incoming RAP biases. While additional model and satellite data offer some improvement to the predictions, architecture choice is more important than feature selection in fine-tuning the results. Our proposed deep residual UNet performs the best by leveraging spatial context from the input RAP profiles; however, the results are remarkably robust across model architecture. 
Further, uncertainty estimates for every level are well-calibrated and can provide useful information to forecasters.","PeriodicalId":94369,"journal":{"name":"Artificial intelligence for the earth systems","volume":"13 2","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136262179","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Francesco Zanetta, Daniele Nerini, Tom Beucler, Mark A. Liniger
{"title":"Physics-constrained deep learning postprocessing of temperature and humidity","authors":"Francesco Zanetta, Daniele Nerini, Tom Beucler, Mark A. Liniger","doi":"10.1175/aies-d-22-0089.1","DOIUrl":"https://doi.org/10.1175/aies-d-22-0089.1","url":null,"abstract":"Weather forecasting centers currently rely on statistical postprocessing methods to minimize forecast error. This improves skill but can lead to predictions that violate physical principles or disregard dependencies between variables, which can be problematic for downstream applications and for the trustworthiness of postprocessing models, especially when they are based on new machine learning approaches. Building on recent advances in physics-informed machine learning, we propose to achieve physical consistency in deep learning-based postprocessing models by integrating meteorological expertise in the form of analytic equations. Applied to the postprocessing of surface weather in Switzerland, we find that constraining a neural network to enforce thermodynamic state equations yields physically consistent predictions of temperature and humidity without compromising performance. 
Our approach is especially advantageous when data is scarce, and our findings suggest that incorporating domain expertise into postprocessing models allows the optimization of weather forecast information while satisfying application-specific requirements.","PeriodicalId":94369,"journal":{"name":"Artificial intelligence for the earth systems","volume":"CE-22 2","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135267511","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Donifan Barahona, Katherine H. Breen, Heike Kalesse-Los, Johannes Röttenbacher
{"title":"Deep Learning Parameterization of Vertical Wind Velocity Variability via Constrained Adversarial Training","authors":"Donifan Barahona, Katherine H. Breen, Heike Kalesse-Los, Johannes Röttenbacher","doi":"10.1175/aies-d-23-0025.1","DOIUrl":"https://doi.org/10.1175/aies-d-23-0025.1","url":null,"abstract":"Atmospheric models with typical resolution in the tens of kilometers cannot resolve the dynamics of air parcel ascent, which varies on scales ranging from tens to hundreds of meters. Small-scale wind fluctuations are thus characterized by a subgrid distribution of vertical wind velocity, W, with standard deviation σ_W. The parameterization of σ_W is fundamental to the representation of aerosol-cloud interactions, yet it is poorly constrained. Using a novel deep learning technique, this work develops a new parameterization for σ_W merging data from global storm-resolving model simulations, high-frequency retrievals of W, and climate reanalysis products. The parameterization reproduces the observed statistics of σ_W and leverages learned physical relations from the model simulations to guide extrapolation beyond the observed domain. Incorporating observational data during the training phase was found to be critical for its performance. 
The parameterization can be applied online within large-scale atmospheric models, or offline using output from weather forecasting and reanalysis products.","PeriodicalId":94369,"journal":{"name":"Artificial intelligence for the earth systems","volume":"82 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136353373","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Steven C Hardiman, Adam A Scaife, Annelize van Niekerk, Rachel Prudden, Aled Owen, Samantha V Adams, Tom Dunstan, Nick J Dunstone, Sam Madge
{"title":"Machine learning for non-orographic gravity waves in a climate model","authors":"Steven C Hardiman, Adam A Scaife, Annelize van Niekerk, Rachel Prudden, Aled Owen, Samantha V Adams, Tom Dunstan, Nick J Dunstone, Sam Madge","doi":"10.1175/aies-d-22-0081.1","DOIUrl":"https://doi.org/10.1175/aies-d-22-0081.1","url":null,"abstract":"There is growing use of machine learning algorithms to replicate sub-grid parametrisation schemes in global climate models. Parametrisations rely on approximations, thus there is potential for machine learning to aid improvements. In this study, a neural network is used to mimic the behaviour of the non-orographic gravity wave scheme used in the Met Office climate model, important for stratospheric climate and variability. The neural network is found to require only two of the six inputs used by the parametrisation scheme, suggesting the potential for greater efficiency in this scheme. Use of a one-dimensional mechanistic model is advocated, allowing neural network hyperparameters to be chosen based on emergent features of the coupled system with minimal computational cost, and providing a test bed prior to coupling to a climate model. A climate model simulation, using the neural network in place of the existing parametrisation scheme, is found to accurately generate a quasi-biennial oscillation of the tropical stratospheric winds, and correctly simulate the non-orographic gravity wave variability associated with the El Niño Southern Oscillation and stratospheric polar vortex variability. 
These internal sources of variability are essential for providing seasonal forecast skill, and the gravity wave forcing associated with them is reproduced without explicit training for these patterns.","PeriodicalId":94369,"journal":{"name":"Artificial intelligence for the earth systems","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135592841","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Clément Brochet, Laure Raynaud, Nicolas Thome, Matthieu Plu, Clément Rambour
{"title":"Multivariate emulation of kilometer-scale numerical weather predictions with generative adversarial networks: a proof-of-concept","authors":"Clément Brochet, Laure Raynaud, Nicolas Thome, Matthieu Plu, Clément Rambour","doi":"10.1175/aies-d-23-0006.1","DOIUrl":"https://doi.org/10.1175/aies-d-23-0006.1","url":null,"abstract":"Emulating numerical weather prediction (NWP) model outputs is important to compute large datasets of weather fields in an efficient way. The purpose of the present paper is to investigate the ability of generative adversarial networks (GAN) to emulate distributions of multivariate outputs (10-meter wind and 2-meter temperature) of a kilometer-scale NWP model. For that purpose, a residual GAN architecture, regularized with spectral normalization, is trained against a kilometer-scale dataset from the AROME ensemble prediction system (AROME-EPS). A wide range of metrics is used for quality assessment, including pixel-wise and multi-scale earth-mover distances, spectral analysis, and correlation length scales. The use of wavelet-based scattering coefficients as meaningful metrics is also presented. The GAN generates samples with good distribution recovery and good skill in average spectrum reconstruction. Important local weather patterns are reproduced with a high level of detail, while the joint generation of multivariate samples matches the underlying AROME-EPS distribution. The different metrics introduced describe the GAN’s behavior in a complementary manner, highlighting the need to go beyond spectral analysis in generation quality assessment. An ablation study then shows that removing variables from the generation process is globally beneficial, pointing at the GAN limitations to leverage cross-variable correlations. The role of absolute positional bias in the training process is also characterized, explaining both accelerated learning and quality-diversity trade-off in the multivariate emulation. 
These results open perspectives about the use of GAN to enrich NWP ensemble approaches, provided that the aforementioned positional bias is properly controlled.","PeriodicalId":94369,"journal":{"name":"Artificial intelligence for the earth systems","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135273779","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
William Yik, Sam J. Silva, Andrew Geiss, Duncan Watson-Parris
{"title":"Exploring Randomly Wired Neural Networks for Climate Model Emulation","authors":"William Yik, Sam J. Silva, Andrew Geiss, Duncan Watson-Parris","doi":"10.1175/aies-d-22-0088.1","DOIUrl":"https://doi.org/10.1175/aies-d-22-0088.1","url":null,"abstract":"Exploring the climate impacts of various anthropogenic emissions scenarios is key to making informed decisions for climate change mitigation and adaptation. State-of-the-art Earth system models can provide detailed insight into these impacts but have a large associated computational cost on a per-scenario basis. This large computational burden has driven recent interest in developing cheap machine learning models for the task of climate model emulation. In this paper, we explore the efficacy of randomly wired neural networks for this task. We describe how they can be constructed and compare them with their standard feedforward counterparts using the ClimateBench dataset. Specifically, we replace the serially connected dense layers in multilayer perceptrons, convolutional neural networks, and convolutional long short-term memory networks with randomly wired dense layers and assess the impact on model performance for models with 1 million and 10 million parameters. We find that models with less-complex architectures see the greatest performance improvement with the addition of random wiring (up to 30.4% for multilayer perceptrons). Furthermore, of 24 different model architecture, parameter count, and prediction task combinations, only one had a statistically significant performance deficit in randomly wired networks relative to their standard counterparts, with 14 cases showing statistically significant improvement. We also find no significant difference in prediction speed between networks with standard feedforward dense layers and those with randomly wired layers. 
These findings indicate that randomly wired neural networks may be suitable direct replacements for traditional dense layers in many standard models. Significance Statement Modeling various greenhouse gas and aerosol emissions scenarios is important for both understanding climate change and making informed political and economic decisions. However, accomplishing this with large Earth system models is a complex and computationally expensive task. As such, data-driven machine learning models have risen in prevalence as cheap emulators of Earth system models. In this work, we explore a special type of machine learning model called randomly wired neural networks and find that they perform competitively for the task of climate model emulation. This indicates that future machine learning models for emulation may significantly benefit from using randomly wired neural networks as opposed to their more-standard counterparts.","PeriodicalId":94369,"journal":{"name":"Artificial intelligence for the earth systems","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136307427","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Equity, Inclusion, and Justice: An Opportunity for Action for AMS Publications Stakeholders","authors":"_ _","doi":"10.1175/aies-d-23-0072.1","DOIUrl":"https://doi.org/10.1175/aies-d-23-0072.1","url":null,"abstract":"","PeriodicalId":94369,"journal":{"name":"Artificial intelligence for the earth systems","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135275315","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Kirstine I. Dale, Edward C. D. Pope, Aaron R. Hopkinson, Theo McCaie, Jason A. Lowe
{"title":"Environment-aware digital twins: incorporating weather and climate information to support risk-based decision-making.","authors":"Kirstine I. Dale, Edward C. D. Pope, Aaron R. Hopkinson, Theo McCaie, Jason A. Lowe","doi":"10.1175/aies-d-23-0023.1","DOIUrl":"https://doi.org/10.1175/aies-d-23-0023.1","url":null,"abstract":"Digital twins are a transformative technology that can significantly strengthen climate adaptation and mitigation decision-making. Through provision of dynamic, virtual representations of physical systems, making intelligent use of multidisciplinary data, and high-fidelity simulations, they equip decision-makers with the information they need, when they need it, marking a step change in how we extract value from data and models. While digital twins are commonplace in some industrial sectors, they are an emerging concept in the environmental sciences and practical demonstrations are limited, partly due to the challenges of representing complex environmental systems. Collaboration on challenges of mutual interest will unlock digital twins’ potential. To bridge the current gap between digital twins for industrial sectors and those of the environment, we identify the need for “environment aware” digital twins (EA-DT) that are a federation of digital twins of environmentally sensitive systems with weather, climate, and environmental information systems. As weather extremes become more frequent and severe, the importance of building weather, climate, and environmental information into digital twins of critical systems such as cities, ports, flood barriers, energy grids, and transport networks increases. Delivering societal benefits will also require significant advances in climate-related decision-making, which lags behind other applications. Progress relies on moving beyond heuristics, and driving advances in the decision sciences informed by new theoretical insights, machine learning and artificial intelligence. 
To support the use of EA-DTs, we propose a new ontology that stimulates thinking about application and best practice for decision-making so that we are resilient to the challenges of today’s weather and tomorrow’s climate.","PeriodicalId":94369,"journal":{"name":"Artificial intelligence for the earth systems","volume":"10 35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135324729","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}