Implementation of McMurchie–Davidson Algorithm for Gaussian AO Integrals Suited for SIMD Processors
Andrey Asadchev and Edward F. Valeev*
The Journal of Physical Chemistry A, volume 129, issue 42, pages 9788–9797. Published 2025-10-13.
DOI: 10.1021/acs.jpca.5c04136
https://pubs.acs.org/doi/10.1021/acs.jpca.5c04136
Abstract
We report an implementation of the McMurchie–Davidson (MD) evaluation scheme for 1- and 2-particle Gaussian AO integrals designed for processors with Single Instruction Multiple Data (SIMD) instruction sets. As in our recent MD implementation for graphics processing units (GPUs) [Asadchev, A.; Valeev, E. F. J. Chem. Phys. 2024, 160, 244109], variable-sized batches of shellsets of integrals are evaluated at a time. By optimizing for floating-point instruction throughput rather than minimizing the number of operations, this approach achieves up to 50% of the theoretical hardware peak FP64 performance on many common SIMD-equipped platforms (AVX2, AVX512, NEON). This translates to speedups of up to 30-fold over the state-of-the-art one-shellset-at-a-time implementation of Obara–Saika-type schemes in Libint for a variety of primitive and contracted integrals. As in our previous work, we rely on standard C++, including features such as the std::simd standard library facility slated for inclusion in the 2026 ISO C++ standard, and avoid explicit code generation to keep the code base small and portable. The implementation is part of the open-source LibintX library, freely available at https://github.com/ValeevGroup/libintx.
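To make the batching idea in the abstract concrete, here is a minimal C++ sketch that evaluates a batch of the simplest possible integral, the overlap of two unnormalized s-type primitive Gaussians, S = (π/(a+b))^(3/2) exp(-ab/(a+b) |A-B|²). It is not taken from LibintX: the names `PairBatch`, `overlap_ss`, `overlap_batch`, and the lane count `kLanes` are all assumptions made for illustration. It uses plain arrays in a structure-of-arrays layout rather than `std::simd`, so every lane of the inner loop performs the same operations on contiguous data.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical lane count: 4 doubles fill one 256-bit register (e.g. AVX2).
constexpr std::size_t kLanes = 4;

// Structure-of-arrays batch of primitive s-type Gaussian pairs.
// All names here are illustrative and are not part of LibintX.
struct PairBatch {
  std::vector<double> a, b;  // exponents of the two Gaussians in each pair
  std::vector<double> r2;    // squared distance |A - B|^2 between the centers
};

// Overlap of two unnormalized s-type primitives:
//   S = (pi / (a + b))^(3/2) * exp(-(a * b / (a + b)) * |A - B|^2)
inline double overlap_ss(double a, double b, double r2) {
  const double pi = 3.14159265358979323846;
  const double p = a + b;
  return std::pow(pi / p, 1.5) * std::exp(-(a * b / p) * r2);
}

// Evaluate the whole batch in chunks of kLanes pairs.
// The inner loop has a fixed trip count and no branches, so each lane
// runs the same instruction stream on neighboring array elements.
std::vector<double> overlap_batch(const PairBatch& batch) {
  const std::size_t n = batch.a.size();
  assert(batch.b.size() == n && batch.r2.size() == n);
  std::vector<double> out(n);
  std::size_t i = 0;
  for (; i + kLanes <= n; i += kLanes) {
    for (std::size_t lane = 0; lane < kLanes; ++lane) {
      const std::size_t k = i + lane;
      out[k] = overlap_ss(batch.a[k], batch.b[k], batch.r2[k]);
    }
  }
  // Remainder that does not fill a full chunk of lanes.
  for (; i < n; ++i) {
    out[i] = overlap_ss(batch.a[i], batch.b[i], batch.r2[i]);
  }
  return out;
}
```

Calling `overlap_batch` on a `PairBatch` holding any number of pairs returns one overlap per pair. Whether a compiler actually vectorizes the inner loop depends on the toolchain and on the availability of vectorized `exp` and `pow`; an explicit SIMD type such as `std::simd` would remove that dependence, which is presumably part of its appeal for a portable code base.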
About the journal:
The Journal of Physical Chemistry A is devoted to reporting new and original experimental and theoretical basic research of interest to physical chemists, biophysical chemists, and chemical physicists.