Implementation of McMurchie–Davidson Algorithm for Gaussian AO Integrals Suited for SIMD Processors

Impact factor 2.8; category zone 2 (Chemistry); JCR Q3 (CHEMISTRY, PHYSICAL)

The Journal of Physical Chemistry A Pub Date : 2025-10-13 DOI:10.1021/acs.jpca.5c04136

Andrey Asadchev and Edward F. Valeev*

Volume 129, Issue 42, pages 9788–9797. Article page: https://pubs.acs.org/doi/10.1021/acs.jpca.5c04136




Implementation of McMurchie–Davidson Algorithm for Gaussian AO Integrals Suited for SIMD Processors

We report an implementation of the McMurchie–Davidson (MD) evaluation scheme for 1- and 2-particle Gaussian AO integrals designed for processors with Single Instruction Multiple Data (SIMD) instruction sets. As in our recent MD implementation for graphics processing units (GPUs) [Asadchev, A.; Valeev, E. F. J. Chem. Phys. 2024, 160, 244109], variable-sized batches of shellsets of integrals are evaluated at a time. By optimizing for floating-point instruction throughput rather than minimizing the number of operations, this approach achieves up to 50% of the theoretical hardware peak FP64 performance on many common SIMD-equipped platforms (AVX2, AVX512, NEON). This translates to speedups of up to a factor of 30 over the state-of-the-art one-shellset-at-a-time implementation of Obara–Saika-type schemes in Libint for a variety of primitive and contracted integrals. As in our previous work, we rely on the standard C++ programming language, including features such as the std::simd standard library facility to be included in the 2026 ISO C++ standard, without any explicit code generation, to keep the code base small and portable. The implementation is part of the open source LibintX library, freely available at https://github.com/ValeevGroup/libintx.
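The batching idea in the abstract can be illustrated with a minimal sketch. It is not taken from LibintX: the names `kLanes`, `PairBatch`, `Overlap1D`, and `overlap_1d` are hypothetical, and plain fixed-width arrays stand in for `std::simd` so that the example compiles with current compilers. Each array index plays the role of one SIMD lane, and every lane evaluates the same McMurchie–Davidson Hermite coefficients, E_0^{00} = exp(-μ X_AB²) and E_0^{10} = X_PA E_0^{00}, for a different pair of primitive Gaussians along one Cartesian axis. Whether the loop is actually vectorized depends on the compiler and its math library.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Hypothetical lane count: 8 doubles fill one 512-bit register.
constexpr std::size_t kLanes = 8;
using Batch = std::array<double, kLanes>;

// Structure-of-arrays batch of primitive Gaussian pairs along one axis.
// Lane l describes exp(-a (x - xa)^2) on center A and exp(-b (x - xb)^2) on center B.
struct PairBatch {
    Batch a;   // exponents on center A
    Batch b;   // exponents on center B
    Batch xa;  // coordinates of center A
    Batch xb;  // coordinates of center B
};

// Unnormalized 1D overlaps <s|s> and <p|s> for every lane.
struct Overlap1D {
    Batch ss;
    Batch ps;
};

inline Overlap1D overlap_1d(const PairBatch& in) {
    const double pi = 3.14159265358979323846;
    Overlap1D out{};
    // Every lane executes the same instruction sequence with no branches,
    // which is the property a SIMD unit needs.
    for (std::size_t l = 0; l < kLanes; ++l) {
        const double p   = in.a[l] + in.b[l];          // total exponent
        const double mu  = in.a[l] * in.b[l] / p;      // reduced exponent
        const double xab = in.xa[l] - in.xb[l];        // A - B
        const double xpa = -in.b[l] / p * xab;         // P - A, P = product center
        const double e00 = std::exp(-mu * xab * xab);  // E_0^{00}
        const double e10 = xpa * e00;                  // E_0^{10} = X_PA * E_0^{00}
        const double s   = std::sqrt(pi / p);          // integral of the Hermite Gaussian with t = 0
        out.ss[l] = e00 * s;
        out.ps[l] = e10 * s;
    }
    return out;
}
```

Filling a `PairBatch` with a = b = 1 and coincident centers gives an `ss` value of sqrt(π/2) and a `ps` value of zero in that lane, as expected by symmetry. The structure-of-arrays layout is the design choice that matters here: the same quantity for consecutive pairs sits contiguously in memory, so one vector load fetches one operand for all lanes.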


Source journal: The Journal of Physical Chemistry A (Chemistry; Physics: Atomic, Molecular and Chemical Physics)

CiteScore: 5.20
Self-citation rate: 10.30%
Articles published: 922
Review time: 1.3 months

Journal description: The Journal of Physical Chemistry A is devoted to reporting new and original experimental and theoretical basic research of interest to physical chemists, biophysical chemists, and chemical physicists.