Generating coupled cluster code for modern distributed memory tensor software
Jan Brandejs, Johann Pototschnig, Trond Saue
arXiv - PHYS - Chemical Physics, 2024-09-10
https://doi.org/arxiv-2409.06759
Scientific groups are struggling to adapt their codes to rapidly evolving
GPU-based HPC platforms. The domain of distributed coupled cluster (CC)
calculations is no exception. Moreover, our applications to tiny QED
effects require higher-order CC methods involving thousands of tensor contractions,
which makes automatic treatment imperative. The challenge is to allow efficient
implementation by capturing key symmetries of the problem, while retaining the
abstraction from the hardware. We present the tensor programming framework
tenpi, which seeks to find this balance. It features a Python library user
interface, global optimization of intermediates, a visualization module, and a
Fortran code generator that bridges the DIRAC package for relativistic
molecular calculations to tensor contraction libraries. tenpi brings
higher-order CC functionality to the massively parallel module of DIRAC. The
architecture and design decision schemes are accompanied by benchmarks and by
first production calculations on Summit, Frontier and LUMI, along with
an overview of the state of the art in tensor contraction software.
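
To illustrate why the choice of intermediates matters when a method contains thousands of contractions, the following minimal sketch counts multiply-add operations for different pairwise evaluation orders of a single CC-like term, R[abij] = sum over k,l,c,d of V[klcd] T[cdij] T[abkl]. It is not tenpi's algorithm or API: the function name chain_cost, the index extents and the brute-force search over orderings are all assumptions made for this example, and symmetries are ignored.

```python
from itertools import permutations
from math import prod

# Hypothetical index extents: 10 occupied (i, j, k, l), 50 virtual (a, b, c, d).
DIMS = {**{x: 10 for x in "ijkl"}, **{x: 50 for x in "abcd"}}


def chain_cost(tensors, output):
    """Multiply-add count for contracting `tensors` pair by pair, left to right.

    Each tensor is a string of index labels. An index is summed away as soon
    as neither the output nor any remaining operand needs it.
    """
    current = tensors[0]
    total = 0
    for pos, nxt in enumerate(tensors[1:], start=1):
        # Indices that must survive this step.
        needed = set(output).union(*tensors[pos + 1:])
        # A naive pairwise contraction loops once over every distinct index.
        loop_idx = set(current) | set(nxt)
        total += prod(DIMS[x] for x in loop_idx)
        # The intermediate keeps only the indices still needed later.
        current = "".join(sorted(loop_idx & needed))
    return total


term = ("klcd", "cdij", "abkl")  # V, T, T
output = "abij"

# Brute force over all orderings; feasible only for a handful of operands.
costs = {order: chain_cost(order, output) for order in permutations(term)}
best_cost = min(costs.values())
worst_cost = max(costs.values())

for order, cost in sorted(costs.items(), key=lambda item: item[1]):
    print(" * ".join(order), "->", output, f"{cost:.3e}")
```

With these assumed extents, forming the small intermediate V[klcd] T[cdij] first is several orders of magnitude cheaper than multiplying the two amplitude tensors first. A global optimizer has the further option of reusing such an intermediate across many terms, which this single-term sketch does not attempt.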