Compiler Management of Communication and Parallelism for Quantum Computation

Jeff Heckey, S. Patil, Ali JavadiAbhari, Adam Holmes, Daniel Kudrow, K. Brown, D. Franklin, F. Chong, M. Martonosi
Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems, published 2015-03-14
DOI: 10.1145/2694344.2694357 (https://doi.org/10.1145/2694344.2694357)
Citations: 49

Abstract

Quantum computing (QC) offers huge promise to accelerate a range of computationally intensive benchmarks. It is limited, however, by decoherence: a quantum state can be maintained only for a short window of time before it decoheres. Quantum error correction codes can protect against decoherence, but fast execution is the best defense, so efficient architectures and effective scheduling algorithms are necessary. This paper proposes the Multi-SIMD QC architecture, then proposes and evaluates schedulers that map benchmark descriptions onto Multi-SIMD architectures. The Multi-SIMD model consists of a small number of SIMD regions, each of which may support operations on up to thousands of qubits per cycle. Efficient Multi-SIMD operation requires efficient scheduling. This work develops schedulers that reduce the communication of qubits between operating regions while also improving parallelism.

We find that communication to global memory is a dominant cost in QC. We also note that many quantum benchmarks have long serial operation paths (although each operation may be data parallel). To exploit this characteristic, we introduce Longest-Path-First Scheduling (LPFS), which pins operations to SIMD regions to keep data in place and reduce communication to memory. Small, local scratchpad memories reduce communication further. Our results show a 3% to 308% improvement for LPFS over conventional scheduling algorithms, and an additional 3% to 64% improvement using scratchpad memories.

Our work is the most comprehensive software-to-quantum toolflow published to date, with efficient and practical scheduling techniques that reduce communication and increase parallelism for full-scale quantum code executing up to a trillion quantum gate operations.
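The idea of pinning long serial paths to SIMD regions can be illustrated with a small sketch. This is not the paper's LPFS implementation; it is a simplified illustration under stated assumptions: each region executes one gate type per timestep on any number of operands, up to k vertex-disjoint longest dependency chains are each pinned to their own region, leftover operations join a region already running the same gate or take an idle one, and communication is approximated by counting dependency edges whose endpoints run in different regions. The names `extract_longest_paths`, `lpfs_sketch`, and `example_ops` are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def extract_longest_paths(deps, k):
    """Peel off up to k vertex-disjoint longest dependency chains."""
    order = list(TopologicalSorter(deps).static_order())
    remaining = set(order)
    paths = []
    while remaining and len(paths) < k:
        length, back = {}, {}
        for n in order:  # predecessors are always visited before n
            if n not in remaining:
                continue
            preds = [p for p in deps[n] if p in remaining]
            best = max(preds, key=lambda p: length[p], default=None)
            length[n] = 1 + (length[best] if best is not None else 0)
            back[n] = best
        # End of the longest chain among the operations not yet pinned.
        end = max((n for n in order if n in remaining), key=lambda n: length[n])
        path = []
        while end is not None:
            path.append(end)
            end = back[end]
        path.reverse()
        paths.append(path)
        remaining -= set(path)
    return paths


def lpfs_sketch(ops, k):
    """ops maps name -> (gate, [dependencies]); k is the number of SIMD regions.

    Returns (schedule, moves). schedule is a list of timesteps, each a dict
    region -> (gate, [operation names]). moves is an assumed cost proxy:
    the number of dependency edges that cross regions.
    """
    if k < 1:
        raise ValueError("need at least one SIMD region")
    deps = {n: list(d) for n, (_, d) in ops.items()}
    paths = extract_longest_paths(deps, k)
    pinned = {n: r for r, path in enumerate(paths) for n in path}
    done, region_of, schedule = set(), {}, []
    while len(done) < len(ops):
        ready = [n for n in ops
                 if n not in done and all(p in done for p in deps[n])]
        step = {}
        # Pinned operations always run in their own region, keeping data in place.
        for n in ready:
            if n in pinned:
                step[pinned[n]] = (ops[n][0], [n])
        # Free operations share a region running the same gate, else an idle one.
        for n in ready:
            if n in pinned:
                continue
            gate = ops[n][0]
            r = next((r for r, (g, _) in step.items() if g == gate), None)
            if r is None:
                r = next((r for r in range(k) if r not in step), None)
            if r is None:
                continue  # no compatible or idle region; retry next timestep
            step.setdefault(r, (gate, []))[1].append(n)
        for r, (_, names) in step.items():
            for n in names:
                region_of[n] = r
                done.add(n)
        schedule.append(step)
    moves = sum(region_of[p] != region_of[n] for n in ops for p in deps[n])
    return schedule, moves


# Hypothetical six-operation program: two serial chains plus one side branch.
example_ops = {
    "a": ("H", []),
    "b": ("CNOT", ["a"]),
    "c": ("T", ["b"]),
    "d": ("H", []),
    "e": ("T", ["d"]),
    "f": ("T", ["a"]),
}
schedule, moves = lpfs_sketch(example_ops, k=2)
for t, step in enumerate(schedule):
    print(t, step)
print("cross-region moves:", moves)
```

In this example the chain a, b, c is pinned to region 0 and the chain d, e to region 1, so none of their dependencies cross regions. The side branch f shares region 1's T gate in the second timestep, which costs one cross-region move for its operand from a.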