A. Gothandaraman, G. L. Warren, G. D. Peterson, R. Harrison
Proceedings of the 2006 ACM/IEEE conference on Supercomputing, 2006-11-11
DOI: https://doi.org/10.1145/1188455.1188638
Citations: 7
Reconfigurable accelerator for quantum Monte Carlo simulations in N-body systems
Recent advances in FPGA technology make FPGAs an attractive platform for accelerating scientific computing applications. We present a novel hardware accelerator for Quantum Monte Carlo simulations in N-body systems. The design is deeply pipelined and exploits the fine-grained parallelism inherent in an FPGA for all calculations. It is implemented on a Xilinx Virtex II Pro XC2VP30 device, and preliminary results indicate a maximum operating frequency of 100 MHz. A single instance of our design offers an estimated 20x speedup over the serial code running on a 2.8 GHz Intel Pentium 4 processor, with comparable accuracy. The architecture performs all computations in fixed-point representation and delivers accuracy on the order of, or better than, double-precision floating point. After deploying a single instance on the present FPGA platform, targeting the design to the Cray XD1 platform with a high gate-density FPGA will allow us to operate multiple cores in parallel.
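To illustrate the fixed-point claim, the following Python sketch compares a fixed-point evaluation of a squared inter-particle distance against the double-precision result. It is only a software model under stated assumptions: the abstract does not say which quantities the hardware computes or which word lengths it uses, so the squared-distance kernel, the helper names (`to_fixed`, `fixed_mul`, `squared_distance_fixed`), and the choice of 40 fractional bits are all hypothetical.

```python
def to_fixed(x: float, frac_bits: int) -> int:
    """Quantize x to a signed integer holding frac_bits fractional bits."""
    return round(x * (1 << frac_bits))


def from_fixed(q: int, frac_bits: int) -> float:
    """Convert a fixed-point integer back to a Python float."""
    return q / (1 << frac_bits)


def fixed_mul(a: int, b: int, frac_bits: int) -> int:
    """Multiply two fixed-point values and rescale to frac_bits.

    The raw product carries 2 * frac_bits fractional bits; adding half an
    LSB before shifting rounds to nearest (ties toward +infinity).
    """
    product = a * b
    return (product + (1 << (frac_bits - 1))) >> frac_bits


def squared_distance_fixed(p, q, frac_bits: int) -> int:
    """Squared Euclidean distance between points p and q in fixed point."""
    total = 0
    for pc, qc in zip(p, q):
        d = to_fixed(pc, frac_bits) - to_fixed(qc, frac_bits)
        total += fixed_mul(d, d, frac_bits)
    return total


def squared_distance_float(p, q) -> float:
    """Double-precision reference for the same quantity."""
    return sum((pc - qc) ** 2 for pc, qc in zip(p, q))


if __name__ == "__main__":
    # Hypothetical particle coordinates, chosen only for illustration.
    p = (0.1, 0.2, 0.3)
    q = (1.5, -0.7, 2.25)
    reference = squared_distance_float(p, q)
    for bits in (16, 40):  # assumed word lengths, not taken from the paper
        approx = from_fixed(squared_distance_fixed(p, q, bits), bits)
        print(f"frac_bits={bits}: |error| = {abs(approx - reference):.3e}")
```

In this model the error shrinks as fractional bits are added, which is the trade-off a fixed-point design can tune per signal. Python integers are unbounded, so overflow, which a real hardware word length would have to account for, is not modeled here.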