Authors: Quan Deng, Qiang Liu
Journal: Engineering Reports
Published: 2023-05-29
Publication type: Journal Article
DOI: 10.1002/eng2.12694 (https://doi.org/10.1002/eng2.12694)
Citations: 0
Field‐programmable gate array acceleration of the Tersoff potential in LAMMPS
Abstract: Molecular dynamics simulation is a widely used method for studying the microscopic world. General-purpose high-performance computing platforms offer low computational and power efficiency for this workload, which limits the practical use of large-scale, long-duration many-body molecular dynamics simulations. To address these problems, a novel molecular dynamics accelerator for the Tersoff potential is designed for field-programmable gate array (FPGA) platforms, enabling LAMMPS to be accelerated with FPGAs. First, an on-the-fly method is proposed that builds neighbor lists and reduces storage usage. Second, multilevel parallelization is implemented so that the accelerator can be deployed flexibly on FPGAs of different scales while still achieving good performance. Finally, mathematical models of the accelerator are built, and a method is proposed for using these models to determine the parameters that give optimal performance. Experimental results on the Xilinx Alveo U200 show that the proposed accelerator achieves 9.51 ns/day for a Tersoff simulation of a 55,296-atom system, a 2.00× performance increase over the Intel I7-8700K and a 1.70× increase over the NVIDIA Tesla K40c on the same test case. In terms of computational efficiency and power efficiency, the proposed accelerator achieves improvements of 2.00× and 7.19× over the Intel I7-8700K, and of 4.33× and 2.11× over the NVIDIA Titan Xp, respectively.
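The abstract does not say how the on-the-fly neighbor construction works, so the following is only a rough illustration of the general idea, not the authors' FPGA design. It is a minimal Python sketch of a generic cell-list scan that produces each atom's neighbors at the moment they are needed instead of storing a full neighbor list. The function names `build_cells` and `neighbors_on_the_fly`, the cubic periodic box, and the requirement that the cutoff is at most half the box length are all assumptions made for this example.

```python
import itertools
import math
from collections import defaultdict


def build_cells(positions, box, cutoff):
    """Bin atom indices into cubic cells whose edge is at least `cutoff`.

    Hypothetical helper: assumes a cubic periodic box of side `box`.
    """
    n_cells = max(1, int(box // cutoff))
    edge = box / n_cells
    cells = defaultdict(list)
    for idx, pos in enumerate(positions):
        key = tuple(math.floor(c / edge) % n_cells for c in pos)
        cells[key].append(idx)
    return cells, n_cells, edge


def neighbors_on_the_fly(i, positions, cells, n_cells, edge, box, cutoff):
    """Yield neighbors of atom `i` within `cutoff` without storing a list.

    Assumes cutoff <= box / 2 so the minimum-image convention is valid.
    """
    home = tuple(math.floor(c / edge) % n_cells for c in positions[i])
    # A set removes duplicate cells when the box has fewer than 3 cells per axis.
    nearby = {
        tuple((h + d) % n_cells for h, d in zip(home, offset))
        for offset in itertools.product((-1, 0, 1), repeat=3)
    }
    for cell in sorted(nearby):
        for j in cells.get(cell, ()):
            if j == i:
                continue
            # Squared distance under the minimum-image convention.
            dist2 = 0.0
            for a, b in zip(positions[i], positions[j]):
                d = a - b
                d -= box * round(d / box)
                dist2 += d * d
            if dist2 < cutoff * cutoff:
                yield j
```

Because `neighbors_on_the_fly` is a generator, a force loop can consume the neighbors of one atom and discard them before moving to the next, so memory use scales with the cell bins rather than with the total number of neighbor pairs. The price is that the distance checks are repeated every time the neighbors are requested.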