Training Deep Surrogate Models with Large Scale Online Learning

Proceedings of the ... International Conference on Machine Learning. International Conference on Machine Learning Pub Date : 2023-06-28 DOI:10.48550/arXiv.2306.16133

Lucas Meyer, M. Schouler, R. Caulk, Alejandro Rib'es, B. Raffin

{"title":"Training Deep Surrogate Models with Large Scale Online Learning","authors":"Lucas Meyer, M. Schouler, R. Caulk, Alejandro Rib'es, B. Raffin","doi":"10.48550/arXiv.2306.16133","DOIUrl":null,"url":null,"abstract":"The spatiotemporal resolution of Partial Differential Equations (PDEs) plays important roles in the mathematical description of the world's physical phenomena. In general, scientists and engineers solve PDEs numerically by the use of computationally demanding solvers. Recently, deep learning algorithms have emerged as a viable alternative for obtaining fast solutions for PDEs. Models are usually trained on synthetic data generated by solvers, stored on disk and read back for training. This paper advocates that relying on a traditional static dataset to train these models does not allow the full benefit of the solver to be used as a data generator. It proposes an open source online training framework for deep surrogate models. The framework implements several levels of parallelism focused on simultaneously generating numerical simulations and training deep neural networks. This approach suppresses the I/O and storage bottleneck associated with disk-loaded datasets, and opens the way to training on significantly larger datasets. Experiments compare the offline and online training of four surrogate models, including state-of-the-art architectures. Results indicate that exposing deep surrogate models to more dataset diversity, up to hundreds of GB, can increase model generalization capabilities. Fully connected neural networks, Fourier Neural Operator (FNO), and Message Passing PDE Solver prediction accuracy is improved by 68%, 16% and 7%, respectively.","PeriodicalId":74529,"journal":{"name":"Proceedings of the ... International Conference on Machine Learning. International Conference on Machine Learning","volume":"14 24 1","pages":"24614-24630"},"PeriodicalIF":0.0000,"publicationDate":"2023-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the ... International Conference on Machine Learning. International Conference on Machine Learning","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.48550/arXiv.2306.16133","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 2

Abstract

The spatiotemporal resolution of Partial Differential Equations (PDEs) plays important roles in the mathematical description of the world's physical phenomena. In general, scientists and engineers solve PDEs numerically by the use of computationally demanding solvers. Recently, deep learning algorithms have emerged as a viable alternative for obtaining fast solutions for PDEs. Models are usually trained on synthetic data generated by solvers, stored on disk and read back for training. This paper advocates that relying on a traditional static dataset to train these models does not allow the full benefit of the solver to be used as a data generator. It proposes an open source online training framework for deep surrogate models. The framework implements several levels of parallelism focused on simultaneously generating numerical simulations and training deep neural networks. This approach suppresses the I/O and storage bottleneck associated with disk-loaded datasets, and opens the way to training on significantly larger datasets. Experiments compare the offline and online training of four surrogate models, including state-of-the-art architectures. Results indicate that exposing deep surrogate models to more dataset diversity, up to hundreds of GB, can increase model generalization capabilities. Fully connected neural networks, Fourier Neural Operator (FNO), and Message Passing PDE Solver prediction accuracy is improved by 68%, 16% and 7%, respectively.

查看原文本刊更多论文

用大规模在线学习训练深度代理模型

偏微分方程(PDEs)的时空分辨率在对世界物理现象的数学描述中起着重要作用。一般来说，科学家和工程师通过使用计算要求很高的求解器对偏微分方程进行数值求解。最近，深度学习算法已经成为快速求解偏微分方程的可行替代方案。模型通常在求解器生成的合成数据上进行训练，这些数据存储在磁盘上并回读用于训练。本文主张，依靠传统的静态数据集来训练这些模型，并不能充分发挥求解器作为数据生成器的优势。提出了一个开源的深度代理模型在线训练框架。该框架实现了几个级别的并行性，重点是同时生成数值模拟和训练深度神经网络。这种方法抑制了与磁盘负载数据集相关的I/O和存储瓶颈，并为在更大的数据集上进行训练开辟了道路。实验比较了四种代理模型的离线和在线训练，包括最先进的架构。结果表明，将深度代理模型暴露于更多的数据集多样性(高达数百GB)可以提高模型的泛化能力。全连接神经网络、傅立叶神经算子(Fourier neural Operator, FNO)和消息传递PDE求解器的预测精度分别提高了68%、16%和7%。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

Proceedings of the ... International Conference on Machine Learning. International Conference on Machine Learning

自引率

0.00%

发文量