{"title":"Graph Contractions for Calculating Correlation Functions in Lattice QCD","authors":"Jing Chen, R. Edwards, W. Mao","doi":"10.1145/3592979.3593409","DOIUrl":"https://doi.org/10.1145/3592979.3593409","url":null,"abstract":"Computing correlation functions for many-particle systems in Lattice QCD is vital to extract nuclear physics observables like the energy spectrum of hadrons such as protons. However, this type of calculation has long been considered to be very challenging and computing-resource intensive because of the complex nature of a hadron composed of quarks with many degrees of freedom. In particular, a correlation function can be calculated through a sum of all possible pairs of quark contractions, each of which is a batched tensor contraction, dictated by Wick's theorem. Because the number of terms of this sum can be very large for any hadronic system of interest, fast evaluation of the sum faces several challenges: an extremely large number of contractions, a huge memory footprint at runtime, and the speed of tensor contractions. In this paper, we present a Lattice QCD analysis software suite, Redstar, which addresses these challenges by utilizing novel algorithmic and software engineering methods targeting modern computing platforms such as many-core CPUs and GPUs. In particular, Redstar represents every term in the sum of a correlation function by a graph, applies efficient graph algorithms to reduce the number of contractions to lower the cost of computations, and minimizes the total memory footprint. 
Moreover, Redstar carries out the contractions on either CPUs or GPUs utilizing an internal, highly efficient contraction library, Hadron. Specifically, we illustrate some important algorithmic optimizations of Redstar, show various key design features of the Hadron library, and present the speedups due to the optimizations, along with performance figures for calculating six correlation functions on four computing platforms.","PeriodicalId":174137,"journal":{"name":"Proceedings of the Platform for Advanced Scientific Computing Conference","volume":"101 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122040303","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
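The cost-reduction idea in this abstract (each Wick contraction term is a graph of quark propagator lines, and common sub-contractions are shared) can be sketched in a few lines. This is an illustrative toy, not the Redstar implementation; the `propagator` callable and the edge-level caching granularity are assumptions for the sketch.

```python
# Toy sketch of graph-based Wick contraction sharing (NOT the Redstar code).
# Each term in the correlation function is a pairing of quarks with
# antiquarks; each pairing is a small graph whose edges are propagator
# lines.  Caching each distinct edge mimics how a graph representation
# exposes common sub-contractions so they are evaluated only once.
from itertools import permutations

def correlation(n_quarks, propagator):
    """Sum over all Wick pairings, evaluating each distinct edge once."""
    cache = {}                        # edge -> evaluated propagator element
    total = 0.0
    for pairing in permutations(range(n_quarks)):
        term = 1.0
        for q, qbar in enumerate(pairing):
            edge = (q, qbar)
            if edge not in cache:     # shared sub-contraction: compute once
                cache[edge] = propagator(q, qbar)
            term *= cache[edge]
        total += term
    return total, len(cache)

# 3 quarks -> 3! = 6 pairings touching 18 edges, but only 9 distinct ones
total, n_evaluated = correlation(3, lambda q, qbar: 1.0)
```

With a real propagator each edge would be a batched tensor contraction, so evaluating 9 instead of 18 is exactly the kind of saving a graph representation buys.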
{"title":"Approximation and Optimization of Global Environmental Simulations with Neural Networks","authors":"E. Azmi, J. Meyer, M. Strobl, M. Weimer, Achim Streit","doi":"10.1145/3592979.3593418","DOIUrl":"https://doi.org/10.1145/3592979.3593418","url":null,"abstract":"Solving a system of hundreds of chemical differential equations in environmental simulations incurs major computational complexity and thereby requires high-performance computing resources, a challenge that grows as the spatio-temporal resolution increases. Machine learning methods, especially deep learning, can offer an approximation of simulations with some factor of speed-up while using fewer compute resources. In this work, we introduce a neural-network-based approach (ICONET) to forecast trace gas concentrations without executing the traditional compute-intensive atmospheric simulations. ICONET is equipped with a multifeature Long Short-Term Memory (LSTM) model to forecast atmospheric chemicals iteratively in time. We generated the training and test datasets, our target dataset for ICONET, by executing an atmospheric chemistry simulation in ICON-ART. Applying the trained ICONET model to forecast a test dataset results in a good fit of the forecast values to our target dataset. We discuss appropriate metrics to evaluate the quality of models and present the quality of the ICONET forecasts with the RMSE and KGE metrics. The varied nature of trace gases limits the model's learning and forecast skill differently for each respective trace gas. In addition to the quality of the ICONET forecasts, we describe the computational efficiency of ICONET as its run-time speed-up in comparison to the run time of the ICON-ART simulation.
The ICONET forecast showed a speed-up factor of 3.1 over the run time of the atmospheric chemistry simulation of ICON-ART, which is a significant achievement, especially when considering the importance of ensemble simulation.","PeriodicalId":174137,"journal":{"name":"Proceedings of the Platform for Advanced Scientific Computing Conference","volume":"418 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132000032","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
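The iterative-in-time forecasting described above (a one-step model applied autoregressively) can be sketched generically. The `rollout` helper and the decay "model" are hypothetical stand-ins, not the ICONET LSTM.

```python
# Minimal sketch of autoregressive rollout (names hypothetical, not the
# ICONET code): a trained one-step forecaster is applied iteratively, each
# predicted state of trace-gas concentrations becoming the next input.
def rollout(step_model, state, n_steps):
    """Roll a one-step forecaster forward n_steps in time."""
    trajectory = [state]
    for _ in range(n_steps):
        state = step_model(state)   # prediction fed back as the next input
        trajectory.append(state)
    return trajectory

# toy stand-in for the learned model: exponential decay of a concentration
traj = rollout(lambda c: 0.9 * c, 1.0, 3)
```

The speed-up over ICON-ART comes precisely from this loop: each `step_model` call replaces a full chemistry solve.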
{"title":"Scaling Resolution of Gigapixel Whole Slide Images Using Spatial Decomposition on Convolutional Neural Networks","authors":"A. Tsaris, Josh Romero, T. Kurth, Jacob Hinkle, Hong-Jun Yoon, Feiyi Wang, Sajal Dash, G. Tourassi","doi":"10.1145/3592979.3593401","DOIUrl":"https://doi.org/10.1145/3592979.3593401","url":null,"abstract":"Gigapixel images are prevalent in scientific domains ranging from remote sensing and satellite imagery to microscopy. However, training a deep learning model at the natural resolution of those images has been a challenge, both in overcoming resource limits (e.g., HBM memory constraints) and in scaling up to a large number of GPUs. In this paper, we trained Residual neural Networks (ResNet) on 22,528 x 22,528-pixel images using a distributed spatial decomposition method on 2,304 GPUs on the Summit supercomputer. We applied our method to a Whole Slide Imaging (WSI) dataset from The Cancer Genome Atlas (TCGA) database. WSI images can be 100,000 x 100,000 pixels or even larger, and in this work we studied the effect of image resolution on a classification task while achieving state-of-the-art AUC scores. Moreover, our approach does not need pixel-level labels, since it avoids patching the WSI images entirely, while adding the capability of training on arbitrarily large images. This is achieved through a distributed spatial decomposition method that leverages the non-blocking fat-tree interconnect network of the Summit architecture, which enabled direct GPU-to-GPU communication.
Finally, detailed performance analysis results are shown, as well as a comparison with a data-parallel approach when possible.","PeriodicalId":174137,"journal":{"name":"Proceedings of the Platform for Advanced Scientific Computing Conference","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128881341","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
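The key mechanism above, spatial decomposition with halo exchange so that a stencil-like operator computed in pieces matches the undecomposed result, can be illustrated in one dimension. This is a serial toy with hypothetical function names, not the distributed GPU implementation.

```python
# 1-D analogue of spatial decomposition (illustrative, not the paper's code):
# each "rank" owns a contiguous chunk and needs one-pixel halos from its
# neighbours to apply a 3-point averaging stencil, just as a convolution
# over a decomposed image needs boundary rows from neighbouring GPUs.
def blur_full(x):
    """Reference: 3-point moving average over the whole signal, zero boundary."""
    ext = [0.0] + list(x) + [0.0]
    return [(ext[i - 1] + ext[i] + ext[i + 1]) / 3.0 for i in range(1, len(x) + 1)]

def blur_decomposed(x, n_parts):
    """Same stencil, computed chunk by chunk with explicit halo exchange."""
    n = len(x)
    assert n % n_parts == 0
    size = n // n_parts
    out = []
    for p in range(n_parts):
        lo, hi = p * size, (p + 1) * size
        left = x[lo - 1] if lo > 0 else 0.0    # halo from left neighbour
        right = x[hi] if hi < n else 0.0       # halo from right neighbour
        local = [left] + x[lo:hi] + [right]
        out += [(local[i - 1] + local[i] + local[i + 1]) / 3.0
                for i in range(1, size + 1)]
    return out

x = [float(v) for v in range(1, 9)]
full = blur_full(x)
dec = blur_decomposed(x, 4)
```

Because only halos cross partition boundaries, the decomposed result is identical to the full-resolution computation, which is what makes training without patching possible.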
{"title":"StyleGAN as a Deconvolutional Operator for Large Eddy Simulations","authors":"J. Castagna, F. Schiavello","doi":"10.1145/3592979.3593404","DOIUrl":"https://doi.org/10.1145/3592979.3593404","url":null,"abstract":"We present a novel deconvolution operator for Large Eddy Simulation (LES) of turbulent flows based on the latest StyleGAN deep learning networks. We exploit the flexibility of this architecture in separating the different layers of the GAN generator, which can be seen as instantaneous fields of the LES. These can be advanced in time by integrating the corresponding filtered Navier-Stokes (NS) equations. The subgrid-scale (SGS) stress tensor is obtained from the reconstructed field rather than from ad-hoc turbulence models. We trained a StyleGAN-based network (MSG-StyleGAN) with 5000 images of a decaying 2D-Homogeneous Isotropic Turbulence (2D-HIT) flow starting at ReΛ = 60 using a 256x256 grid mesh size. We then reconstructed a DNS simulation, point by point, using a 32x32 resolution via a search in the latent space of the GAN until the difference between the internal fields and the LES fields is within a given tolerance. Results show convergence towards the ground-truth DNS solution as the tolerance approaches zero.","PeriodicalId":174137,"journal":{"name":"Proceedings of the Platform for Advanced Scientific Computing Conference","volume":" 24","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133021079","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
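The latent-space search described above (adjust the latent code until the generated field matches the observed field within a tolerance) can be sketched with a scalar toy. Everything here is hypothetical: a 1-D "generator" and a finite-difference descent stand in for the paper's MSG-StyleGAN and its actual search procedure.

```python
# Toy sketch of latent-space search to a tolerance (NOT the paper's method):
# minimise the squared mismatch between generator output and a target
# field by finite-difference gradient descent on the latent code z.
def latent_search(generator, target, z, tol=1e-8, lr=0.02, max_iters=5000):
    """Descend on (generator(z) - target)^2 until it falls below tol."""
    for _ in range(max_iters):
        err = (generator(z) - target) ** 2
        if err < tol:
            break
        h = 1e-6
        grad = ((generator(z + h) - target) ** 2 - err) / h
        z -= lr * grad
    return z

# toy 1-D 'generator' standing in for the GAN; the target plays the LES field
g = lambda z: 2.0 * z + 1.0
z_star = latent_search(g, 5.0, 0.0)
```

As the abstract notes, shrinking `tol` drives the reconstruction toward the ground truth; in the toy, tightening `tol` moves `g(z_star)` arbitrarily close to the target.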
{"title":"Data-Driven Whole-Genome Clustering to Detect Geospatial, Temporal, and Functional Trends in SARS-CoV-2 Evolution","authors":"Jean Merlet, John H. Lagergren, Verónica G. Melesse Vergara, Mikaela Cashman, C. Bradburne, R. Plowright, E. Gurley, Wayne Joubert, Daniel Jacobson","doi":"10.1145/3592979.3593425","DOIUrl":"https://doi.org/10.1145/3592979.3593425","url":null,"abstract":"Current methods for defining SARS-CoV-2 lineages ignore the vast majority of the SARS-CoV-2 genome. We develop and apply an exhaustive vector comparison method that directly compares all known SARS-CoV-2 genome sequences to produce novel lineage classifications. We utilize data-driven models that (i) accurately capture the complex interactions across the set of all known SARS-CoV-2 genomes, (ii) scale to leadership-class computing systems, and (iii) enable tracking how such strains evolve geospatially over time. We show that during the height of the original Omicron surge, countries across Europe, Asia, and the Americas had a spatially asynchronous distribution of Omicron sub-strains. Moreover, neighboring countries were often dominated by either different clusters of the same variant or different variants altogether throughout the pandemic. 
Analyses of this kind may suggest a different pattern of epidemiological risk than was understood from conventional data, as well as produce actionable insights and transform our ability to prepare for and respond to current and future biological threats.","PeriodicalId":174137,"journal":{"name":"Proceedings of the Platform for Advanced Scientific Computing Conference","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126010103","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
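The exhaustive vector-comparison idea above (compare all genome pairs directly, then group them into lineage clusters) can be sketched at toy scale. The similarity measure and greedy single-linkage grouping are illustrative assumptions, not the paper's leadership-scale method.

```python
# Toy sketch of all-pairs genome comparison and clustering (illustrative
# only): similarity is the fraction of agreeing positions between two
# aligned sequences, and clusters are grown greedily by single linkage.
def similarity(a, b):
    """Fraction of positions where two aligned genome strings agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cluster(genomes, threshold=0.9):
    """Assign each genome to the first cluster it is similar enough to."""
    clusters = []
    for g in genomes:
        for c in clusters:
            if any(similarity(g, m) >= threshold for m in c):
                c.append(g)
                break
        else:
            clusters.append([g])   # no close match: start a new lineage
    return clusters

clusters = cluster(["AAAA", "AAAT", "TTTT", "TTTA"], threshold=0.7)
```

The real computation does this over the full SARS-CoV-2 genome for every known sequence pair, which is why it needs leadership-class systems.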
{"title":"Runtime Steering of Molecular Dynamics Simulations Through In Situ Analysis and Annotation of Collective Variables","authors":"Silvina Caíno-Lores, M. Cuendet, Jack D. Marquez, E. Kots, Trilce Estrada, E. Deelman, Harel Weinstein, M. Taufer","doi":"10.1145/3592979.3593420","DOIUrl":"https://doi.org/10.1145/3592979.3593420","url":null,"abstract":"This paper targets one of the most common simulations on petascale and, very likely, on exascale machines: molecular dynamics (MD) simulations studying the (classical) time evolution of a molecular system at atomic resolution. Specifically, this work addresses the data challenges of MD simulations at exascale through (1) the creation of a data analysis method based on a suite of advanced collective variables (CVs) selected for annotation of structural molecular properties and capturing rare conformational events at runtime, (2) the definition of an in situ framework to automatically identify the frames where the rare events occur during an MD simulation and (3) the integration of both method and framework into two MD workflows for the study of early termination or termination and restart of a benchmark molecular system for protein folding ---the Fs peptide system (Ace-A_5(AAARA)_3A-NME)--- using Summit. The approach achieves faster exploration of the conformational space compared to extensive ensemble simulations. Specifically, our in situ framework with early termination alone achieves 99.6% coverage of the reference conformational space for the Fs peptide with just 60% of the MD steps otherwise used for a traditional execution of the MD simulation. 
Annotation-based restart allows us to cover 94.6% of the conformational space, just running 50% of the overall MD steps.","PeriodicalId":174137,"journal":{"name":"Proceedings of the Platform for Advanced Scientific Computing Conference","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133517702","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
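The in situ annotation-and-early-termination loop described above can be sketched generically: advance the simulation, compute a collective variable per frame, and stop as soon as the CV signals the event of interest. The function names and the toy dynamics are assumptions for illustration, not the paper's framework.

```python
# Sketch of in situ CV annotation with early termination (illustrative,
# not the paper's framework): each frame is annotated with a collective
# variable, and the run stops once the CV reaches the target region.
def run_with_annotation(step_fn, cv_fn, n_steps, target, tol):
    """Advance an MD-like loop, annotating frames and terminating early."""
    frames, state = [], 0.0
    for i in range(n_steps):
        state = step_fn(state)
        cv = cv_fn(state)
        frames.append((i, cv))          # in situ annotation of this frame
        if abs(cv - target) < tol:      # rare event detected: stop early
            break
    return frames

# toy dynamics and CV standing in for the simulation and its annotations
frames = run_with_annotation(lambda s: s + 0.1, lambda s: s, 100, 0.5, 1e-6)
```

Stopping at frame 5 of a possible 100 mirrors the paper's result of covering the conformational space with only a fraction of the MD steps.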
{"title":"Exploiting symmetries for preconditioning Poisson's equation in CFD simulations","authors":"À. Alsalti-Baldellou, C. Janna, X. Álvarez-Farré, F. Trias","doi":"10.1145/3592979.3593410","DOIUrl":"https://doi.org/10.1145/3592979.3593410","url":null,"abstract":"Divergence constraints are present in the governing equations of many physical phenomena, and they usually lead to a Poisson equation whose solution is one of the most challenging parts of scientific simulation codes. Indeed, it is the main bottleneck of incompressible Computational Fluid Dynamics (CFD) simulations, and developing efficient and scalable Poisson solvers is a critical task. This work presents an enhanced variant of the Factored Sparse Approximate Inverse (FSAI) preconditioner. It arises from exploiting s spatial reflection symmetries, which are often present in academic and industrial configurations and allow transforming Poisson's equation into a set of 2^s fully-decoupled subsystems. Then, we introduce another level of approximation by taking advantage of the subsystems' close similarity and applying the same FSAI to all of them. This leads to substantial memory savings and notable increases in arithmetic intensity resulting from employing the more compute-intensive sparse matrix-matrix product. Of course, recycling the same preconditioner on all the subsystems worsens its convergence. However, this effect was much smaller than expected and led us to introduce relatively cheap but very effective low-rank corrections. A key feature of these corrections is that, because they are applied to each subsystem independently, the more symmetries being exploited, the more effective they become, leading to up to 5.7x faster convergence than the standard FSAI.
Numerical experiments on up to 1.07 billion grid points confirm the quality of our low-rank corrected FSAI, which, despite being 2.6x lighter, outperforms the standard FSAI by a factor of up to 4.4x.","PeriodicalId":174137,"journal":{"name":"Proceedings of the Platform for Advanced Scientific Computing Conference","volume":"3 5","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120848712","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
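The simplest instance of the reflection-symmetry decoupling above is a 2x2 mirror-symmetric system: changing to sum/difference variables splits it into two independent 1x1 subsystems, which is the s = 1 case of the 2^s decoupling. This is a hand-worked illustration, not the FSAI code.

```python
# Smallest instance of reflection-symmetry decoupling (illustrative only):
# the mirror-symmetric system [[a, b], [b, a]] x = f decouples in the
# sum/difference basis into two independent scalar subsystems.
def solve_mirror_symmetric(a, b, f1, f2):
    """Solve [[a, b], [b, a]] x = (f1, f2) via symmetric/antisymmetric parts."""
    s = (f1 + f2) / (a + b)    # symmetric subsystem
    d = (f1 - f2) / (a - b)    # antisymmetric subsystem
    return (s + d) / 2.0, (s - d) / 2.0

x1, x2 = solve_mirror_symmetric(3.0, 1.0, 5.0, 3.0)
```

With s symmetries the same change of basis, applied once per symmetry, yields 2^s smaller systems; the paper's contribution is reusing one FSAI across all of them plus per-subsystem low-rank corrections.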
{"title":"Universal Data Junction: A Transport Layer for Data Driven Workflows","authors":"U. Haus, Timothy Dykes, Aniello Esposito, Clément Foyer, Adrian Tate","doi":"10.1145/3592979.3593423","DOIUrl":"https://doi.org/10.1145/3592979.3593423","url":null,"abstract":"A novel transport library for the efficient coupling of applications through their data dependencies is presented. The design is driven by the intent to require minimal changes to existing scientific applications: applications declare the data objects that are meaningful for other applications to read and write, and the library performs transparent transport, including automatic redistribution of parallel data structures, thus permitting seamless coupling of applications in workflows. The actual transport can be selected at run time and can exploit a variety of data exchange methods, including MPI, Dataspaces, Ceph Rados, CRAY Datawarp, and a POSIX file system. For the case of MPI transport, the library is used to implement the first stage of a co-working visualization pipeline for CP2K, and results show a significant advantage compared to a filesystem-based approach.","PeriodicalId":174137,"journal":{"name":"Proceedings of the Platform for Advanced Scientific Computing Conference","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126700157","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
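The run-time transport selection described above amounts to a put/get interface with pluggable backends chosen by name. The sketch below is purely hypothetical (an in-memory backend and a `make_transport` factory), not the Universal Data Junction API; a real library would register MPI, POSIX, and object-store backends behind the same interface.

```python
# Hypothetical sketch of run-time backend selection (NOT the UDJ API):
# producers put named data objects, consumers get them, and the concrete
# transport is picked from a registry when the program starts.
class MemoryTransport:
    """Trivial in-process backend standing in for MPI/POSIX/object stores."""
    def __init__(self):
        self.store = {}

    def put(self, name, obj):
        self.store[name] = obj

    def get(self, name):
        return self.store[name]

BACKENDS = {"memory": MemoryTransport}   # real code: "mpi", "posix", ...

def make_transport(kind):
    """Select a transport implementation by name at run time."""
    return BACKENDS[kind]()

t = make_transport("memory")
t.put("density_field", [1.0, 2.0, 3.0])
```

Because applications only see `put`/`get` on named objects, swapping the backend requires no application changes, which is the minimal-intrusion goal the abstract states.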
{"title":"Scalable Multi-FPGA Design of a Discontinuous Galerkin Shallow-Water Model on Unstructured Meshes","authors":"Jennifer Faj, Tobias Kenter, S. Faghih-Naini, Christian Plessl, V. Aizinger","doi":"10.1145/3592979.3593407","DOIUrl":"https://doi.org/10.1145/3592979.3593407","url":null,"abstract":"FPGAs are attracting interest as energy-efficient accelerators for scientific simulations, including for methods operating on unstructured meshes. Considering the potential impact on high-performance computing, specific attention needs to be given to the scalability of such approaches. In this context, the networking capabilities of FPGA hardware and software stacks can play a crucial role in enabling solutions that go beyond a traditional host-MPI and accelerator-offload model. In this work, we present the multi-FPGA scaling of a discontinuous Galerkin shallow-water model using direct low-latency streaming communication between the FPGAs. To this end, the unstructured mesh defining the spatial domain of the simulation is partitioned, the inter-FPGA network is configured to match the topology of neighboring partitions, and halo communication is overlapped with the dataflow computation pipeline. With this approach, we demonstrate strong scaling on up to eight FPGAs with a parallel efficiency of >80% and execution times per time step as low as 7.6 μs. At the same time, with weak scaling, the approach allows simulating larger meshes that would exceed the local memory limits of a single FPGA, now supporting meshes of more than 100,000 elements and reaching an aggregated performance of up to 6.5 TFLOP/s.
Finally, a hierarchical partitioning approach allows for better utilization of the FPGA compute resources in some designs and, by mitigating limitations posed by the communication topology, enables simulations with up to 32 partitions on 8 FPGAs.","PeriodicalId":174137,"journal":{"name":"Proceedings of the Platform for Advanced Scientific Computing Conference","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124436741","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Massively Parallel Multi-Scale FE2 Framework for Multi-Trillion Degrees of Freedom Simulations","authors":"C. Moulinec, G. Houzeaux, R. Borrell, Adria Quintanas Corominas, G. Oyarzun, Judicael Grasset, G. Giuntoli, M. Vázquez","doi":"10.1145/3592979.3593415","DOIUrl":"https://doi.org/10.1145/3592979.3593415","url":null,"abstract":"The advent of hybrid CPU and accelerator supercomputers opens the door to extremely large multi-scale simulations. One such multi-scale technique, the FE2 approach, is designed to simulate material deformations by obtaining a better estimation of the material properties, which, in effect, reduces the need to introduce physical modelling at the macro-scale level, such as constitutive laws. Both macro- and micro-scales are solved using the Finite Element method, the micro-scale being resolved at the Gauss points of the macro-scale mesh. As the micro-scale simulations do not require any information from each other and are thus run concurrently, the stated problem is embarrassingly parallel. The FE2 method therefore directly benefits from hybrid machines, the macro-scale being solved on CPUs whereas the micro-scale is offloaded to accelerators. The case of a flat plate made of different materials is used to illustrate the potential of the method. To ensure good load balance on distributed-memory machines, weighting based on the types of materials the plate is made of is applied by means of a Space-Filling Curve technique.
Simulations have been carried out for over 5 trillion degrees of freedom on up to 2,048 nodes (49,152 CPUs and 12,288 GPUs) of the US DOE Oak Ridge National Laboratory high-end machine, Summit, showing an excellent speed-up for the assembly part of the framework, where the micro-scale is computed on GPU using CUDA.","PeriodicalId":174137,"journal":{"name":"Proceedings of the Platform for Advanced Scientific Computing Conference","volume":"52 12","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120970003","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
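The weighted load balancing described above (elements ordered along a space-filling curve, then split into contiguous chunks of roughly equal total weight) can be sketched in a few lines. The greedy splitter below is an illustrative assumption, not the framework's actual partitioner.

```python
# Toy sketch of weighted partitioning along a space-filling-curve ordering
# (illustrative only): elements are assumed already SFC-ordered, weights
# reflect per-element cost (e.g. material type), and the list is cut into
# contiguous chunks of roughly equal total weight.
def partition_by_weight(weights, n_parts):
    """Greedily split SFC-ordered element weights into balanced chunks."""
    total = sum(weights)
    target = total / n_parts
    parts, current, acc = [], [], 0.0
    for i, w in enumerate(weights):
        current.append(i)
        acc += w
        if acc >= target and len(parts) < n_parts - 1:
            parts.append(current)      # chunk reached its weight budget
            current, acc = [], 0.0
    parts.append(current)
    return parts

# four cheap elements and two expensive ones split evenly across two ranks
parts = partition_by_weight([1, 1, 1, 1, 2, 2], 2)
```

Because SFC ordering keeps nearby elements contiguous, each chunk is also spatially compact, which keeps halo communication small.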