Samyam Rajbhandari, Akshay Nikam, Pai-Wei Lai, Kevin Stock, S. Krishnamoorthy, P. Sadayappan
{"title":"CAST:对称张量的收缩算法","authors":"Samyam Rajbhandari, Akshay Nikam, Pai-Wei Lai, Kevin Stock, S. Krishnamoorthy, P. Sadayappan","doi":"10.1109/ICPP.2014.35","DOIUrl":null,"url":null,"abstract":"Tensor contractions represent the most compute- intensive core kernels in ab initio computational quantum chemistry and nuclear physics. Symmetries in these tensor contractions make them difficult to load balance and scale to large distributed systems. In this paper, we develop an efficient and scalable algorithm to contract symmetric tensors. We introduce a novel approach that avoids data redistribution during contraction of symmetric tensors while also bypassing redundant storage and maintaining load balance. We present experimental results on two parallel supercomputers for several symmetric contractions that appear in the coupled cluster singles and doubles (CCSD) quantum chemistry method. We also present a novel approach to tensor redistribution that can take advantage of parallel hyperplanes when the initial distribution has replicated dimensions, and use collective broadcast when the final distribution has replicated dimensions, making the algorithm very efficient.","PeriodicalId":441115,"journal":{"name":"2014 43rd International Conference on Parallel Processing","volume":"518 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":"{\"title\":\"CAST: Contraction Algorithm for Symmetric Tensors\",\"authors\":\"Samyam Rajbhandari, Akshay Nikam, Pai-Wei Lai, Kevin Stock, S. Krishnamoorthy, P. Sadayappan\",\"doi\":\"10.1109/ICPP.2014.35\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Tensor contractions represent the most compute- intensive core kernels in ab initio computational quantum chemistry and nuclear physics. Symmetries in these tensor contractions make them difficult to load balance and scale to large distributed systems. In this paper, we develop an efficient and scalable algorithm to contract symmetric tensors. We introduce a novel approach that avoids data redistribution during contraction of symmetric tensors while also bypassing redundant storage and maintaining load balance. We present experimental results on two parallel supercomputers for several symmetric contractions that appear in the coupled cluster singles and doubles (CCSD) quantum chemistry method. We also present a novel approach to tensor redistribution that can take advantage of parallel hyperplanes when the initial distribution has replicated dimensions, and use collective broadcast when the final distribution has replicated dimensions, making the algorithm very efficient.\",\"PeriodicalId\":441115,\"journal\":{\"name\":\"2014 43rd International Conference on Parallel Processing\",\"volume\":\"518 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2014-10-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"4\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2014 43rd International Conference on Parallel Processing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICPP.2014.35\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2014 43rd International Conference on Parallel Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICPP.2014.35","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Tensor contractions represent the most compute- intensive core kernels in ab initio computational quantum chemistry and nuclear physics. Symmetries in these tensor contractions make them difficult to load balance and scale to large distributed systems. In this paper, we develop an efficient and scalable algorithm to contract symmetric tensors. We introduce a novel approach that avoids data redistribution during contraction of symmetric tensors while also bypassing redundant storage and maintaining load balance. We present experimental results on two parallel supercomputers for several symmetric contractions that appear in the coupled cluster singles and doubles (CCSD) quantum chemistry method. We also present a novel approach to tensor redistribution that can take advantage of parallel hyperplanes when the initial distribution has replicated dimensions, and use collective broadcast when the final distribution has replicated dimensions, making the algorithm very efficient.