Exploratory data science on supercomputers for quantum mechanical calculations
William Dawson, Louis Beal, Laura E Ratcliff, Martina Stella, Takahito Nakajima, Luigi Genovese
Electronic Structure, published 2024-06-11. DOI: https://doi.org/10.1088/2516-1075/ad4b80
Abstract
Literate programming, the bringing together of program code and natural language narratives, has become a ubiquitous approach in data science. This methodology is also appealing for Density Functional Theory (DFT) calculations, particularly for interactively developing new methodologies and workflows. However, effective use of literate programming is hampered by old programming paradigms and by the difficulties of using high-performance computing (HPC) resources. Here we present two Python libraries that aim to remove these hurdles. First, we describe the PyBigDFT library, which can be used to set up materials or molecular systems and provides high-level access to the wavelet-based BigDFT code. We then present the related remotemanager library, which is able to serialize and execute arbitrary Python functions on remote supercomputers. We show how, together, these libraries enable transparent access to HPC-based DFT calculations and can serve as building blocks for rapid prototyping and data exploration.
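The idea of serializing a Python function and executing it elsewhere can be illustrated with a minimal sketch that uses only the standard library. This is not the remotemanager API: the helper names (`build_script`, `run_in_fresh_interpreter`) and the example function are hypothetical, and a fresh local Python interpreter stands in for the remote supercomputer. The function is carried as source text, its arguments as JSON, and the result comes back as JSON on standard output.

```python
import json
import subprocess
import sys
import textwrap

# Hypothetical example function, kept as source text so it can be shipped
# to another interpreter. In a notebook the text could instead be obtained
# from a live function with inspect.getsource.
FUNC_SOURCE = '''
def energy_per_atom(total_energy, n_atoms):
    return total_energy / n_atoms
'''


def build_script(source: str, name: str, kwargs: dict) -> str:
    """Build a standalone script that defines the function, calls it
    with the given keyword arguments, and prints the result as JSON."""
    payload = json.dumps(kwargs)
    return (
        "import json\n"
        f"{textwrap.dedent(source)}\n"
        f"args = json.loads({payload!r})\n"
        f"print(json.dumps({name}(**args)))\n"
    )


def run_in_fresh_interpreter(source: str, name: str, kwargs: dict):
    """Run the script in a new Python process (a local stand-in for a
    remote machine) and decode the JSON result from its stdout."""
    script = build_script(source, name, kwargs)
    completed = subprocess.run(
        [sys.executable, "-c", script],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(completed.stdout)


result = run_in_fresh_interpreter(
    FUNC_SOURCE, "energy_per_atom", {"total_energy": -76.0, "n_atoms": 4}
)
print(result)
```

A real remote workflow would additionally need file transfer, a job scheduler submission step, and result retrieval, none of which this sketch attempts; it only shows that a function plus JSON-encodable arguments is enough to reproduce a call in a separate process.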