{"title":"Scalable Integration of Computational Physics Simulations with Machine Learning","authors":"Mathew Boyer, W. Brewer, D. Jude, I. Dettwiller","doi":"10.1109/AI4S56813.2022.00013","DOIUrl":null,"url":null,"abstract":"Integration of machine learning with simulation is part of a growing trend, however, the augmentation of codes in a highly-performant, distributed manner poses a software development challenge. In this work, we explore the question of how to easily augment legacy simulation codes on high-performance computers (HPCs) with machine-learned surrogate models, in a fast, scalable manner. Initial naïve augmentation attempts required significant code modification and resulted in significant slowdown. This led us to explore inference server techniques, which allow for model calls through drop-in functions. In this work, we investigated TensorFlow Serving with $\\mathbf{gRPC}$ and RedisAI with SmartRedis for server-client inference implementations, where the deep learning platform runs as a persistent process on HPC compute node GPUs and the simulation makes client calls while running on the CPUs. We evaluated inference performance for several use cases on SCOUT, an IBM POWER9 supercomputer, including, real gas equations of state, machine-learned boundary conditions for rotorcraft aerodynamics, and super-resolution techniques. We will discuss key findings on performance. The lessons learned may provide useful advice for researchers to augment their simulation codes in an optimal manner.","PeriodicalId":262536,"journal":{"name":"2022 IEEE/ACM International Workshop on Artificial Intelligence and Machine Learning for Scientific Applications (AI4S)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE/ACM International Workshop on Artificial Intelligence and Machine Learning for Scientific Applications (AI4S)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/AI4S56813.2022.00013","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Integration of machine learning with simulation is part of a growing trend; however, augmenting codes in a highly performant, distributed manner poses a software development challenge. In this work, we explore how to easily augment legacy simulation codes on high-performance computers (HPCs) with machine-learned surrogate models in a fast, scalable manner. Initial naïve augmentation attempts required significant code modification and resulted in significant slowdown. This led us to explore inference server techniques, which allow for model calls through drop-in functions. We investigated TensorFlow Serving with gRPC and RedisAI with SmartRedis for server-client inference implementations, where the deep learning platform runs as a persistent process on the GPUs of HPC compute nodes and the simulation, running on the CPUs, makes client calls. We evaluated inference performance for several use cases on SCOUT, an IBM POWER9 supercomputer, including real-gas equations of state, machine-learned boundary conditions for rotorcraft aerodynamics, and super-resolution techniques. We discuss key findings on performance. The lessons learned may provide useful guidance for researchers seeking to augment their own simulation codes in an optimal manner.
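To make the server-client pattern concrete, the following is a minimal sketch of what a drop-in client call could look like using the SmartRedis Python client against a RedisAI backend, as one possible instance of the approach the abstract describes. The server address, model key, tensor names, and array shapes are illustrative assumptions rather than details from the paper, and the exact SmartRedis API may vary by version.

```python
# Minimal sketch (not the authors' code): a drop-in SmartRedis client call to a
# RedisAI inference server that runs as a persistent process on a GPU node.
# The address, model key, tensor names, and shapes below are hypothetical.
import numpy as np
from smartredis import Client

# Connect to the persistent RedisAI server (placeholder address).
client = Client(address="127.0.0.1:6379", cluster=False)

# One-time setup: load a TorchScript surrogate model onto the server's GPU.
client.set_model_from_file("eos_surrogate", "eos_surrogate.pt", "TORCH", device="GPU")

# Per-timestep call from the CPU-side simulation: send inputs, run inference
# on the server, and retrieve the machine-learned result.
state = np.random.rand(1024, 2).astype(np.float32)   # e.g., density and internal energy
client.put_tensor("eos_input", state)
client.run_model("eos_surrogate", inputs=["eos_input"], outputs=["eos_output"])
pressure = client.get_tensor("eos_output")
```

In this pattern, the simulation code only gains a handful of put/run/get calls, which is what makes the augmentation "drop-in" compared with embedding a deep learning framework directly in the solver.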