Colmena: Scalable Machine-Learning-Based Steering of Ensemble Simulations for High Performance Computing

2021 IEEE/ACM Workshop on Machine Learning in High Performance Computing Environments (MLHPC). Pub Date: 2021-10-06. DOI: 10.1109/MLHPC54614.2021.00007

Logan T. Ward, G. Sivaraman, J. G. Pauloski, Y. Babuji, Ryan Chard, Naveen K. Dandu, P. Redfern, R. Assary, K. Chard, L. Curtiss, R. Thakur, Ian T. Foster


Citations: 14

Abstract



Scientific applications that involve simulation ensembles can be accelerated greatly by using experiment design methods to select the best simulations to perform. Methods that use machine learning (ML) to create proxy models of simulations show particular promise for guiding ensembles but are challenging to deploy because of the need to coordinate dynamic mixes of simulation and learning tasks. We present Colmena, an open-source Python framework that allows users to steer campaigns by providing just the implementations of individual tasks plus the logic used to choose which tasks to execute when. Colmena handles task dispatch, results collation, ML model invocation, and ML model (re)training, using Parsl to execute tasks on HPC systems. We describe the design of Colmena and illustrate its capabilities by applying it to electrolyte design, where it both scales to 65536 CPUs and accelerates the discovery rate for high-performance molecules by a factor of 100 over unguided searches.
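
The steering pattern the abstract describes (simulate a few candidates, train a cheap proxy model on the results, use it to pick the next simulation, repeat) can be illustrated with a minimal, self-contained sketch. This is not Colmena's API and does not use Parsl: the toy `simulate` function, the Gaussian-kernel proxy in `train_proxy`, and the `steer` loop are all hypothetical names chosen for illustration, and everything runs serially in one process.

```python
import math
import random


def simulate(x):
    """Toy stand-in for an expensive simulation (hypothetical); higher is better."""
    return -(x - 0.7) ** 2


def train_proxy(results, width=0.1):
    """Build a cheap proxy model from the simulations finished so far.

    The proxy here is a Gaussian-kernel-weighted average of known scores,
    an assumption made for brevity rather than the model used in the paper.
    """
    known = dict(results)  # snapshot, so each retraining yields a fixed model

    def predict(x):
        weights = {k: math.exp(-((x - k) / width) ** 2) for k in known}
        total = sum(weights.values())
        return sum(weights[k] * known[k] for k in known) / total

    return predict


def steer(candidates, budget, n_seed=3, seed=0):
    """Run `budget` simulations, letting the proxy choose all but the first few."""
    rng = random.Random(seed)
    results = {}
    # Seed the campaign with a few randomly chosen simulations.
    for x in rng.sample(candidates, n_seed):
        results[x] = simulate(x)
    # Alternate: retrain the proxy, rank untested candidates, simulate the top one.
    while len(results) < budget:
        predict = train_proxy(results)
        untested = [x for x in candidates if x not in results]
        best_guess = max(untested, key=predict)
        results[best_guess] = simulate(best_guess)
    return results


candidates = [i / 100 for i in range(101)]
results = steer(candidates, budget=10)
best = max(results, key=results.get)
print(f"best candidate after {len(results)} simulations: x={best}")
```

In a real campaign the simulation and training steps would run as separate tasks on HPC resources, which is the coordination problem the abstract says Colmena is designed to handle.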
