PolyPal: A parallel microscale virtual specimen generator
Younggak Shin, Vichhika Moul, Keonwook Kang, Byeongchan Lee
Computer Physics Communications, Volume 308, Article 109458. Published 2024-12-02. DOI: 10.1016/j.cpc.2024.109458
Abstract
We present an open source program, PolyPal, that generates polycrystalline virtual specimens at the micrometer scale for atomistic calculations and visualization. Unlike regular meshes or perfect lattices, atomic positions in polycrystalline materials must be defined before calculations begin, so an atom-generation code is judged by the maximum specimen size it can generate and by the efficiency of the required input-output process. Existing atom-generation codes are serial, so the maximum specimen size is limited by on-board memory. Furthermore, a single position file with billions of atoms is difficult to handle, not only because reading it sequentially takes a long time but also because full domain decomposition takes hours. PolyPal addresses these challenges with a fully parallelized MPI input-output scheme that supports multiple export options on a Linux cluster. It imposes no limit on the system size and shows virtually perfect scalability. Additionally, by controlling the size distribution and homogeneity of grains, the program can simulate the different microstructures typically found in bulk systems or in thin-film samples prepared with different fabrication processes. PolyPal is intended to support molecular dynamics codes in the coming age of exascale computing.
Program summary
Program Title: PolyPal
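CPC Library link to program files: https://doi.org/10.17632/5cpbmrtzbr.1
Developer's repository link: https://gitlab.com/GeonbuShin/polypal.git
Licensing provisions: GPLv3
Programming language: C++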
Nature of problem: There is no open source code capable of generating massive polycrystalline microstructures that contain billions of atoms. Existing codes run in a single thread, so the system size is limited by the available memory. In addition, a single input/output file stream is not appropriate for a system with a large number of atoms, because file reading and writing would take a prohibitively long time and become a bottleneck.
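To put the memory limit in numbers, the following is a rough estimate that is not taken from the paper. It assumes an fcc copper single crystal (lattice constant 3.615 Å, four atoms per unit cell) filling a cube 1 µm on a side, with only three double-precision coordinates stored per atom. The function names are hypothetical.

```cpp
#include <cassert>

// Illustrative estimate only; names and the fcc-copper assumption are not
// taken from PolyPal. Surfaces and grain boundaries are ignored.

// Approximate number of atoms in a cubic fcc crystal of the given edge length.
double fcc_atom_count(double edge_angstrom, double lattice_constant_angstrom) {
    assert(edge_angstrom >= 0.0 && lattice_constant_angstrom > 0.0);
    const double cells_per_edge = edge_angstrom / lattice_constant_angstrom;
    return 4.0 * cells_per_edge * cells_per_edge * cells_per_edge;  // 4 atoms per fcc cell
}

// Bytes needed to hold x, y, z as doubles for every atom (no IDs, types, or velocities).
double position_bytes(double atom_count) {
    return atom_count * 3.0 * static_cast<double>(sizeof(double));
}
```

Under these assumptions, fcc_atom_count(1.0e4, 3.615) gives roughly 8.5 × 10^10 atoms, and position_bytes of that count is about 2 × 10^12 bytes (about 2 TB) for coordinates alone, which would typically exceed the memory available to a single serial process.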
Solution method: PolyPal creates atomic structures in parallel and also writes positions to multiple files in parallel using the Message Passing Interface (MPI). Parallel computing not only enables access to micrometer-scale atomic systems but also accelerates the entire process from atom generation to file generation. Each subdomain is filled with atoms simultaneously according to the prescribed domain topology, with inherent load balancing. The atomic structure is written out to multiple files in parallel: one structure file is generated via MPI-IO for each domain assigned to the corresponding node of a computing cluster, which drastically reduces the input-output time and makes it possible to handle a large virtual specimen.
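The idea of filling each subdomain independently and writing one file per subdomain can be sketched as follows. This is a minimal serial illustration, not PolyPal's implementation: it uses no MPI or MPI-IO, it fills a perfect fcc lattice rather than a polycrystal, and every type, function name, and the XYZ-like text format are assumptions made for the example. In a parallel code, each call with a different rank value would run on a different process.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical types and names for illustration; none are taken from PolyPal.
struct Atom { double x, y, z; };
struct CellRange { int lo, hi; };  // half-open range [lo, hi) of unit-cell indices

// Split n unit cells along one axis among `parts` subdomains as evenly as
// possible: the first (n % parts) subdomains receive one extra cell.
CellRange split_axis(int n, int parts, int index) {
    assert(n >= 0 && parts > 0 && index >= 0 && index < parts);
    const int base = n / parts;
    const int extra = n % parts;
    const int lo = index * base + (index < extra ? index : extra);
    const int hi = lo + base + (index < extra ? 1 : 0);
    return {lo, hi};
}

// Fill the subdomain owned by `rank` with fcc atoms.
// cells: unit cells per axis in the whole box; grid: subdomains per axis;
// a: lattice constant.
std::vector<Atom> fill_subdomain(const std::array<int, 3>& cells,
                                 const std::array<int, 3>& grid,
                                 int rank, double a) {
    assert(rank >= 0 && rank < grid[0] * grid[1] * grid[2]);
    // Map the linear rank to subdomain coordinates, x varying fastest.
    const int ix = rank % grid[0];
    const int iy = (rank / grid[0]) % grid[1];
    const int iz = rank / (grid[0] * grid[1]);
    const CellRange rx = split_axis(cells[0], grid[0], ix);
    const CellRange ry = split_axis(cells[1], grid[1], iy);
    const CellRange rz = split_axis(cells[2], grid[2], iz);

    // Four-atom fcc basis in fractional coordinates.
    static const double basis[4][3] = {
        {0.0, 0.0, 0.0}, {0.5, 0.5, 0.0}, {0.5, 0.0, 0.5}, {0.0, 0.5, 0.5}};

    const std::size_t ncells = static_cast<std::size_t>(rx.hi - rx.lo) *
                               static_cast<std::size_t>(ry.hi - ry.lo) *
                               static_cast<std::size_t>(rz.hi - rz.lo);
    std::vector<Atom> atoms;
    atoms.reserve(4 * ncells);
    for (int k = rz.lo; k < rz.hi; ++k)
        for (int j = ry.lo; j < ry.hi; ++j)
            for (int i = rx.lo; i < rx.hi; ++i)
                for (const auto& b : basis)
                    atoms.push_back({(i + b[0]) * a, (j + b[1]) * a, (k + b[2]) * a});
    return atoms;
}

// Write one subdomain to its own file, e.g. "prefix.3.xyz" for rank 3.
// "X" is a placeholder element symbol. Returns false on any I/O failure.
bool write_subdomain(const std::string& prefix, int rank,
                     const std::vector<Atom>& atoms) {
    std::ofstream out(prefix + "." + std::to_string(rank) + ".xyz");
    if (!out) return false;
    out << atoms.size() << "\nsubdomain " << rank << "\n";
    for (const Atom& p : atoms)
        out << "X " << p.x << ' ' << p.y << ' ' << p.z << '\n';
    return static_cast<bool>(out);
}
```

Because each subdomain is derived from the rank alone, no process needs to hold or read the whole specimen, and splitting the cells evenly keeps the per-rank atom counts within one layer of cells of each other in this idealized lattice.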
Additional comments including restrictions and unusual features: PolyPal can generate a layered polycrystalline structure to mimic a thin-film specimen prepared by deposition. The code utilizes MPI-IO to accelerate file output.
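The summary does not say how grains or layers are constructed. One common approach in polycrystal generators is a Voronoi-type construction, in which each point belongs to the grain whose seed is nearest. The sketch below illustrates that approach with seeds arranged in slabs stacked along z to imitate a layered film. It is an assumption for illustration only, and all names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration of nearest-seed (Voronoi-type) grain assignment;
// not confirmed to be the method used by PolyPal.
struct Vec3 { double x, y, z; };

// Return the index of the seed closest to p. In a Voronoi construction this
// index identifies the grain that owns the point.
std::size_t nearest_seed(const Vec3& p, const std::vector<Vec3>& seeds) {
    assert(!seeds.empty());
    std::size_t best = 0;
    double best_d2 = -1.0;  // negative means "no candidate yet"
    for (std::size_t i = 0; i < seeds.size(); ++i) {
        const double dx = p.x - seeds[i].x;
        const double dy = p.y - seeds[i].y;
        const double dz = p.z - seeds[i].z;
        const double d2 = dx * dx + dy * dy + dz * dz;
        if (best_d2 < 0.0 || d2 < best_d2) {
            best = i;
            best_d2 = d2;
        }
    }
    return best;
}

// Place seeds on a regular nx-by-ny grid at the mid-height of each of
// `layers` slabs stacked along z, in a box of size lx * ly * lz.
std::vector<Vec3> layered_seeds(double lx, double ly, double lz,
                                int nx, int ny, int layers) {
    assert(nx > 0 && ny > 0 && layers > 0);
    std::vector<Vec3> seeds;
    for (int l = 0; l < layers; ++l)
        for (int j = 0; j < ny; ++j)
            for (int i = 0; i < nx; ++i)
                seeds.push_back({(i + 0.5) * lx / nx,
                                 (j + 0.5) * ly / ny,
                                 (l + 0.5) * lz / layers});
    return seeds;
}
```

A regular seed grid yields grains of uniform size. Drawing seed positions or spacings from a chosen distribution instead would be one way to vary grain size and homogeneity, again as an assumption rather than a description of the program.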
About the journal:
The focus of CPC is on contemporary computational methods and techniques and their implementation, the effectiveness of which will normally be evidenced by the author(s) within the context of a substantive problem in physics. Within this setting CPC publishes two types of paper.
Computer Programs in Physics (CPiP)
These papers describe significant computer programs to be archived in the CPC Program Library which is held in the Mendeley Data repository. The submitted software must be covered by an approved open source licence. Papers and associated computer programs that address a problem of contemporary interest in physics that cannot be solved by current software are particularly encouraged.
Computational Physics Papers (CP)
These are research papers on themes that include, but are not limited to, the following across computational physics and related disciplines:
mathematical and numerical methods and algorithms;
computational models including those associated with the design, control and analysis of experiments; and
algebraic computation.
Each will normally include software implementation and performance details. The software implementation should, ideally, be available via GitHub, Zenodo or an institutional repository. In addition, research papers may be considered on the impact of advanced computer architecture and special purpose computers on computing in the physical sciences, and on software topics related to, and of importance in, the physical sciences.