Optimization of LAMMPS
J. Fischer, V. Natoli, D. Richie
2006 HPCMP Users Group Conference (HPCMP-UGC'06), published 2006-06-26
DOI: 10.1109/HPCMP-UGC.2006.56 (https://doi.org/10.1109/HPCMP-UGC.2006.56)
Citations: 4
Abstract
The Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS) code is part of the Department of Defense (DoD) High Performance Computing Modernization Program (HPCMP) technology insertion (TI) benchmarking suite of applications. As a component of the TI benchmarking suite, LAMMPS contributes significantly to Computational Chemistry and Materials Science requirements within the DoD, so ensuring its optimal performance on HPCMP resources is a high priority. The CCM-KY5-003 User Productivity Enhancements and Technology Transfer (PET) project was created to profile and optimize LAMMPS, improving its performance and efficiency on two HPCMP assets: Eagle (Altix 3700) and JVN (Linux cluster). Profiling was carried out on Eagle using the Tuning and Analysis Utilities (TAU) in conjunction with the Performance Application Programming Interface (PAPI). Timing and hardware counter data were analyzed with ParaProf and PerfExplorer, while trace data were analyzed automatically by the Kit for Objective Judgment and Knowledge-based Detection of Performance Bottlenecks (KOJAK). The profiling effort used three model systems, each with a different physical potential. Several opportunities for performance improvement were identified in the critical portions of the code, including loop-static branching conditions, multi-dimensional C arrays, and redundant integer and floating-point operations. In addition to algorithmic changes, compiler options and machine-specific instructions were investigated. Depending on the physical potential employed, performance improved by a factor of 1.3 to 3.5 on Eagle and 1.2 to 1.8 on JVN.
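The three code-level issues named in the abstract can be illustrated with a small sketch. It is not taken from LAMMPS or from the paper: the toy Lennard-Jones-style energy sum (each atom measured from the origin, which no atom may occupy) and the names `energy_baseline`, `energy_kernel`, and `energy_flat` are hypothetical. The baseline shows a pointer-to-pointer array, a flag tested on every iteration although it never changes inside the loop, and a `std::pow` call repeated per atom. The second version assumes a flat coordinate array, precomputed coefficients, and a branch decided once through a template parameter.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Baseline (hypothetical): row-pointer array, loop-static branch, and
// sigma^6 recomputed for every atom.
double energy_baseline(double* const* x, std::size_t n, bool shifted,
                       double eps, double sigma, double shift) {
    double e = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double r2 = x[i][0] * x[i][0] + x[i][1] * x[i][1] + x[i][2] * x[i][2];
        const double sr6 = std::pow(sigma, 6) / (r2 * r2 * r2);  // redundant work
        e += 4.0 * eps * (sr6 * sr6 - sr6);
        if (shifted) e -= shift;  // flag never changes inside the loop
    }
    return e;
}

// Kernel with the flag as a compile-time constant, so each instantiation
// contains no run-time test inside the loop.
template <bool Shifted>
double energy_kernel(const double* x, std::size_t n,
                     double c6, double c12, double shift) {
    double e = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double* p = x + 3 * i;  // flat layout: x, y, z stored contiguously
        const double r2 = p[0] * p[0] + p[1] * p[1] + p[2] * p[2];
        const double inv6 = 1.0 / (r2 * r2 * r2);
        e += inv6 * (c12 * inv6 - c6);
        if (Shifted) e -= shift;  // resolved at compile time
    }
    return e;
}

// Dispatcher: computes the loop-invariant coefficients once and selects the
// kernel with a single branch outside the loop.
double energy_flat(const std::vector<double>& x, bool shifted,
                   double eps, double sigma, double shift) {
    assert(x.size() % 3 == 0);
    const std::size_t n = x.size() / 3;
    const double s2 = sigma * sigma;
    const double s6 = s2 * s2 * s2;
    const double c6 = 4.0 * eps * s6;  // 4 * eps * sigma^6
    const double c12 = c6 * s6;        // 4 * eps * sigma^12
    return shifted ? energy_kernel<true>(x.data(), n, c6, c12, shift)
                   : energy_kernel<false>(x.data(), n, c6, c12, shift);
}
```

Both versions compute the same sum up to floating-point rounding. Whether such changes pay off on a given machine depends on the compiler and memory system, which is why the abstract reports different speedups for Eagle and JVN.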