International Journal of High Performance Computing Applications: Latest Articles

Result-Scalability: Following the Evolution of Selected Social Impact of HPC

IF 2.5, Zone 3 (Computer Science)

International Journal of High Performance Computing Applications Pub Date : 2025-04-24 DOI: 10.1177/10943420251338168

Sally Ellingson, Guillaume Pallez

Citations: 0

TwoFold: Highly accurate structure and affinity prediction for protein-ligand complexes from sequences

Zone 3 (Computer Science)

International Journal of High Performance Computing Applications Pub Date : 2023-10-30 DOI: 10.1177/10943420231201151

Darren J Hsu, Hao Lu, Aditya Kashi, Michael Matheson, John Gounley, Feiyi Wang, Wayne Joubert, Jens Glaser

Citations: 0

GenSLMs: Genome-scale language models reveal SARS-CoV-2 evolutionary dynamics

Zone 3 (Computer Science)

International Journal of High Performance Computing Applications Pub Date : 2023-10-27 DOI: 10.1177/10943420231201154

Maxim Zvyagin, Alexander Brace, Kyle Hippe, Yuntian Deng, Bin Zhang, Cindy Orozco Bohorquez, Austin Clyde, Bharat Kale, Danilo Perez-Rivera, Heng Ma, Carla M. Mann, Michael Irvin, Defne G. Ozgulbas, Natalia Vassilieva, James Gregory Pauloski, Logan Ward, Valerie Hayot-Sasson, Murali Emani, Sam Foreman, Zhen Xie, Diangen Lin, Maulik Shukla, Weili Nie, Josh Romero, Christian Dallago, Arash Vahdat, Chaowei Xiao, Thomas Gibbs, Ian Foster, James J. Davis, Michael E. Papka, Thomas Brettin, Rick Stevens, Anima Anandkumar, Venkatram Vishwanath, Arvind Ramanathan

Abstract: We seek to transform how new and emergent variants of pandemic-causing viruses, specifically SARS-CoV-2, are identified and classified. By adapting large language models (LLMs) for genomic data, we build genome-scale language models (GenSLMs) which can learn the evolutionary landscape of SARS-CoV-2 genomes. By pre-training on over 110 million prokaryotic gene sequences and fine-tuning a SARS-CoV-2-specific model on 1.5 million genomes, we show that GenSLMs can accurately and rapidly identify variants of concern. Thus, to our knowledge, GenSLMs represents one of the first whole-genome scale foundation models which can generalize to other prediction tasks. We demonstrate scaling of GenSLMs on GPU-based supercomputers and AI-hardware accelerators utilizing 1.63 Zettaflops in training runs with a sustained performance of 121 PFLOPS in mixed precision and peak of 850 PFLOPS. We present initial scientific insights from examining GenSLMs in tracking evolutionary dynamics of SARS-CoV-2, paving the path to realizing this on large biological data.

Citations: 0

General framework for re-assuring numerical reliability in parallel Krylov solvers: A case of bi-conjugate gradient stabilized methods

Zone 3 (Computer Science)

International Journal of High Performance Computing Applications Pub Date : 2023-10-25 DOI: 10.1177/10943420231207642

Roman Iakymchuk, Stef Graillat, José I. Aliaga

Abstract: Parallel implementations of Krylov subspace methods often help to accelerate the procedure of finding an approximate solution of a linear system. However, such parallelization coupled with asynchronous and out-of-order execution often makes more visible the non-associativity impact in floating-point operations. These problems are even amplified when communication-hiding pipelined algorithms are used to improve the parallelization of Krylov subspace methods. Introducing reproducibility in the implementations avoids these problems by getting more robust and correct solutions. This paper proposes a general framework for deriving reproducible and accurate variants of Krylov subspace methods. The proposed algorithmic strategies are reinforced by programmability suggestions to assure deterministic and accurate executions. The framework is illustrated on the preconditioned BiCGStab method and its pipelined modification, which in fact is a distinctive method from the Krylov subspace family, for the solution of non-symmetric linear systems with message-passing. Finally, we verify the numerical behavior of the two reproducible variants of BiCGStab on a set of matrices from the SuiteSparse Matrix Collection and a 3D Poisson's equation.
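The non-associativity mentioned in this abstract can be seen in a few lines. The sketch below is only an illustration and not the framework the paper proposes: it sums the same four numbers in two orders, as two differently scheduled parallel reductions might, and then uses Python's `math.fsum` (swapped in here as a stand-in for a reproducible, correctly rounded reduction) to show an order-independent result. The input values and the helper name `naive_sum` are assumptions chosen for the demonstration.

```python
import math

def naive_sum(values):
    """Left-to-right floating-point accumulation, as in a simple reduction loop."""
    total = 0.0
    for v in values:
        total += v
    return total

# The same four numbers in two different orders (hypothetical reduction orders).
order_a = [1e16, 1.0, -1e16, 1.0]
order_b = [1e16, -1e16, 1.0, 1.0]

# In order_a the first 1.0 is absorbed by 1e16 (the spacing between doubles
# near 1e16 is 2), so the two naive sums disagree.
naive_a = naive_sum(order_a)
naive_b = naive_sum(order_b)

# math.fsum returns the correctly rounded sum, so it does not depend on order.
exact_a = math.fsum(order_a)
exact_b = math.fsum(order_b)

print(naive_a, naive_b, exact_a, exact_b)
```

A correctly rounded sum is one way to obtain bitwise-identical results regardless of reduction order; whether the paper uses this or another mechanism is not stated in the abstract.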

Citations: 0

Role-shifting threads: Increasing OpenMP malleability to address load imbalance at MPI and OpenMP

Zone 3 (Computer Science)

International Journal of High Performance Computing Applications Pub Date : 2023-10-21 DOI: 10.1177/10943420231201153

Joel Criado, Victor Lopez, Joan Vinyals-Ylla-Catala, Guillem Ramirez-Miranda, Xavier Teruel, Marta Garcia-Gasulla

Abstract: This paper presents the evolution of the free agent threads for OpenMP to the new role-shifting threads model and their integration with the Dynamic Load Balancing (DLB) library. We demonstrate how free agent threads can improve resource utilization in OpenMP applications with load imbalance in their nested parallel regions. We also demonstrate how DLB efficiently manages the malleability exposed by the role-shifting threads to address load imbalance issues. We use three real-world scientific applications, one of them to demonstrate that free agents alone can improve the OpenMP model without external tools, and two other MPI+OpenMP applications, one of them with a coupling case, to illustrate the potential of the free agent threads' malleability with an external resource manager to increase the efficiency of the system. In addition, we demonstrate that the new implementation is more usable than the former one, letting the runtime system automatically make decisions that were made by the programmer previously. All software is released open-source.

Citations: 0

Efficient implementation of low-order-precision smoothed particle hydrodynamics

Zone 3 (Computer Science)

International Journal of High Performance Computing Applications Pub Date : 2023-09-14 DOI: 10.1177/10943420231201144

Natsuki Hosono, Mikito Furuichi

Abstract: Smoothed particle hydrodynamics (SPH) method is widely accepted as a flexible numerical treatment for surface boundaries and interactions. High-resolution simulations of hydrodynamic events require high-performance computing (HPC). There is a need for an SPH code that runs efficiently on modern supercomputers involving accelerators such as NVIDIA or AMD graphics processing units. In this work, we applied half-precision, which is widely used in artificial intelligence, to the SPH method. However, improving HPC performance at such low-order precisions is a challenge. An as-is implementation with half-precision will have lower computational cost than that of float/double precision simulations, but also worsens the simulation accuracy. We propose a scaling and shifting method that maintains the simulation accuracy near the level of float/double precision. By examining the impact of half-precision on the simulation accuracy and time-to-solution, we demonstrated that the use of half-precision can improve the computational performance of SPH simulations for scientific purposes without sacrificing the accuracy. In addition, we demonstrated that the efficiency of half-precision depends on the architecture used.
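To make the general idea behind shifting and scaling concrete, here is a minimal sketch. It is not the authors' method, whose details the abstract does not give. It rounds values to IEEE 754 half precision with the standard-library `struct` format `"e"` and compares an as-is conversion against (a) shifting coordinates relative to a nearby origin and (b) scaling a small quantity by a power of two before conversion. The helper `to_half`, the coordinates, the mass, and the scale factor are all hypothetical.

```python
import struct

def to_half(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]

# Shifting: two hypothetical particle coordinates far from the origin.
x_i, x_j = 1000.0, 1000.1
# As-is: near 1000 the half-precision spacing is 0.5, so the separation vanishes.
as_is_diff = to_half(x_j) - to_half(x_i)
# Shifted: subtract a local origin in double precision first, then convert.
origin = 1000.0
shifted_diff = to_half(x_j - origin) - to_half(x_i - origin)

# Scaling: a hypothetical small mass that falls in the half-precision subnormal range.
mass = 1e-6
scale = 2.0 ** 14  # a power of two, so scaling itself adds no rounding error
as_is_rel_err = abs(to_half(mass) - mass) / mass
scaled_rel_err = abs(to_half(mass * scale) / scale - mass) / mass

print(as_is_diff, shifted_diff, as_is_rel_err, scaled_rel_err)
```

In both cases the half-precision value is kept inside the range where its 11 significant bits are fully usable, which is the usual motivation for this kind of preprocessing.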

Citations: 0

Heterogeneous programming using OpenMP and CUDA/HIP for hybrid CPU-GPU scientific applications

IF 3.1, Zone 3 (Computer Science)

International Journal of High Performance Computing Applications Pub Date : 2023-08-11 DOI: 10.1177/10943420231188079

Marc Gonzalez Tallada, E. Morancho

Abstract: Hybrid computer systems combine compute units (CUs) of different nature like CPUs, GPUs and FPGAs. Simultaneously exploiting the computing power of these CUs requires a careful decomposition of the applications into balanced parallel tasks according to both the performance of each CU type and the communication costs among them. This paper describes the design and implementation of runtime support for OpenMP hybrid GPU-CPU applications, when mixed with GPU-oriented programming models (e.g. CUDA/HIP). The paper describes the case for a hybrid multi-level parallelization of the NPB-MZ benchmark suite. The implementation exploits both coarse-grain and fine-grain parallelism, mapped to compute units of different nature (GPUs and CPUs). The paper describes the implementation of runtime support to bridge OpenMP and HIP, introducing the abstractions of Computing Unit and Data Placement. We compare hybrid and non-hybrid executions under state-of-the-art schedulers for OpenMP: static and dynamic task schedulings. Then, we improve the set of schedulers with two additional variants: a memorizing-dynamic task scheduling and a profile-based static task scheduling. On a computing node composed of one AMD EPYC 7742 @ 2.250 GHz (64 cores and 2 threads/core, totalling 128 threads per node) and 2 × GPU AMD Radeon Instinct MI50 with 32 GB, hybrid executions present speedups from 1.10× up to 3.5× with respect to a non-hybrid GPU implementation, depending on the number of activated CUs.

Volume 37, pages 626-646
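The difference between static and dynamic task scheduling on compute units of unequal speed can be sketched with a toy model. This is not the runtime described in the paper: the function names, the task costs, and the 1:4 speed ratio are assumptions, and the model ignores data transfers and scheduling overhead.

```python
import heapq

def static_makespan(task_costs, unit_speeds):
    """Assign tasks round-robin up front; return when the last unit finishes."""
    finish = [0.0] * len(unit_speeds)
    for i, cost in enumerate(task_costs):
        unit = i % len(unit_speeds)
        finish[unit] += cost / unit_speeds[unit]
    return max(finish)

def dynamic_makespan(task_costs, unit_speeds):
    """Hand each task, in order, to whichever unit becomes idle first."""
    idle = [(0.0, unit) for unit in range(len(unit_speeds))]
    heapq.heapify(idle)
    for cost in task_costs:
        free_at, unit = heapq.heappop(idle)
        heapq.heappush(idle, (free_at + cost / unit_speeds[unit], unit))
    return max(free_at for free_at, _ in idle)

tasks = [4.0] * 8       # eight equal tasks (hypothetical work units)
speeds = [1.0, 4.0]     # hypothetical: unit 1 is four times faster than unit 0

static_time = static_makespan(tasks, speeds)
dynamic_time = dynamic_makespan(tasks, speeds)
print(static_time, dynamic_time)
```

In this toy setting the even static split leaves the fast unit idle while the slow one finishes, whereas the dynamic policy lets the fast unit take most of the tasks. A profile-based static split, as named in the abstract, would instead size each unit's share from measured speeds.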

Citations: 0

Running ahead of evolution: AI-based simulation for predicting future high-risk SARS-CoV-2 variants

Zone 3 (Computer Science)

International Journal of High Performance Computing Applications Pub Date : 2023-07-29 DOI: 10.1177/10943420231188077

Jie Chen, Zhiwei Nie, Yu Wang, Kai Wang, Fan Xu, Zhiheng Hu, Bing Zheng, Zhennan Wang, Guoli Song, Jingyi Zhang, Jie Fu, Xiansong Huang, Zhongqi Wang, Zhixiang Ren, Qiankun Wang, Daixi Li, Dongqing Wei, Bin Zhou, Chao Yang, Yonghong Tian

Abstract: The never-ending emergence of SARS-CoV-2 variations of concern (VOCs) has challenged the whole world for pandemic control. In order to develop effective drugs and vaccines, one needs to efficiently simulate SARS-CoV-2 spike receptor-binding domain (RBD) mutations and identify high-risk variants. We pretrain a large protein language model with approximately 408 million protein sequences and construct a high-throughput screening for the prediction of binding affinity and antibody escape. As the first work on SARS-CoV-2 RBD mutation simulation, we successfully identify mutations in the RBD regions of 5 VOCs and can screen millions of potential variants in seconds. Our workflow scales to 4096 NPUs with 96.5% scalability and 493.9× speedup in mixed-precision computing, while achieving a peak performance of 366.8 PFLOPS (reaching 34.9% theoretical peak) on Pengcheng Cloudbrain-II. Our method paves the way for simulating coronavirus evolution in order to prepare for a future pandemic that will inevitably take place. Our models are released at https://github.com/ZhiweiNiepku/SARS-CoV-2_mutation_simulation to facilitate future related work.

Citations: 0

Guest editors note: Special issue on clusters, clouds, and data for scientific computing

IF 3.1, Zone 3 (Computer Science)

International Journal of High Performance Computing Applications Pub Date : 2023-07-01 DOI: 10.1177/10943420231180188

J. Dongarra, B. Tourancheau

Abstract: The research areas of cluster, cloud, and data analytics computing, which today provide fundamental infrastructure for all areas of advanced computational science, are being radically transformed by the convergence of at least two unprecedented trends. The first is the ongoing emergence of multicore and hybrid microprocessor designs, ushering in a new era of computing in which system designers must accept energy usage as a first-order constraint, and application designers must be able to exploit parallelism and data locality to an unprecedented degree. As the research community is rapidly becoming aware, the components of the traditional HPC software stack are poorly matched to the characteristics of systems based on these new architectures: hundreds of thousands of nodes, millions of cores, GPU accelerators, reduced bandwidth, and memory per core. The second trend is the dramatic escalation in the amount of data that leading edge scientific applications, and the communities that use them, are either generating or trying to analyze. A key problem in such data intensive science lies not only in the sheer volume of bits that must be processed and managed but also in the logistical problems associated with making the data of most current interest available to participants in large national and international collaborations, sitting in different administrative domains, spread across the wide area network, and wanting to use diverse resources: clusters, clouds, and data. This special issue gathers selected papers of the Workshop on Clusters, Clouds and Data for Scientific Computing (CCDSC) that was held at La Maison des Contes, 69490 Dareize-France, on September 6-9, 2022. This workshop is a continuation of a series of workshops started in 1992 entitled Workshop on Environments and Tools for Parallel Scienti

Volume 37, pages 211-212

Citations: 0

Parallel multithreaded deduplication of data sequences in nuclear structure calculations

IF 3.1, Zone 3 (Computer Science)

International Journal of High Performance Computing Applications Pub Date : 2023-06-30 DOI: 10.1177/10943420231183697

D. Langr, T. Dytrych

Citations: 0