2017 IEEE 25th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM): Latest Papers, Page 6

An Out-of-Order Load-Store Queue for Spatial Computing

2017 IEEE 25th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM) Pub Date : 2017-04-01 DOI: 10.1145/3126525

Lana Josipović, P. Brisk, P. Ienne

Citations: 31

Escher: A CNN Accelerator with Flexible Buffering to Minimize Off-Chip Transfer

2017 IEEE 25th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM) Pub Date : 2017-04-01 DOI: 10.1109/FCCM.2017.47

Yongming Shen, M. Ferdman, Peter Milder

Abstract: Convolutional neural networks (CNNs) are used to solve many challenging machine learning problems. Interest in CNNs has led to the design of CNN accelerators to improve CNN evaluation throughput and efficiency. Importantly, the bandwidth demand from weight data transfer for modern large CNNs causes CNN accelerators to be severely bandwidth bottlenecked, prompting the need for processing images in batches to increase weight reuse. However, existing CNN accelerator designs limit the choice of batch sizes and lack support for batch processing of convolutional layers. We observe that, for a given storage budget, choosing the best batch size requires balancing the input and weight transfer. We propose Escher, a CNN accelerator with a flexible data buffering scheme that ensures a balance between the input and weight transfer bandwidth, significantly reducing overall bandwidth requirements. For example, compared to the state-of-the-art CNN accelerator designs targeting a Virtex-7 690T FPGA, Escher reduces the accelerator peak bandwidth requirements by 2.4x across both fully-connected and convolutional layers on fixed-point AlexNet, and reduces convolutional layer bandwidth by up to 10.5x on fixed-point GoogleNet.
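The batching argument in the Escher abstract can be illustrated with a deliberately simplified traffic model. The sketch below is not Escher's buffering scheme: it assumes a single fully-connected layer whose weights are fetched from off-chip memory once per batch, while the inputs and outputs of every image in the batch stay on chip. The function names, the layer sizes, and the storage budget are all hypothetical values chosen for illustration.

```python
def per_image_traffic(weight_words, input_words, output_words, batch):
    """Off-chip words moved per image in the toy model.

    Assumption: weights are fetched once per batch, so their cost is
    shared by `batch` images; inputs and outputs move once per image.
    """
    return weight_words / batch + input_words + output_words


def max_batch(storage_words, input_words, output_words):
    """Largest batch whose inputs and outputs fit in the on-chip budget.

    Assumption: the whole budget is spent on per-image input and
    output buffers (weights are streamed, not stored).
    """
    return storage_words // (input_words + output_words)


if __name__ == "__main__":
    # Hypothetical layer: 9216 inputs, 4096 outputs (illustrative sizes).
    n_in, n_out = 9216, 4096
    weights = n_in * n_out
    budget = 512 * 1024  # hypothetical on-chip budget, in words

    limit = max_batch(budget, n_in, n_out)
    for b in (1, 2, 4, 8, 16, 32, limit):
        words = per_image_traffic(weights, n_in, n_out, b)
        print(f"batch={b:3d}  words per image={words:,.0f}")
```

In this toy model weight traffic per image falls as 1/batch while buffer demand grows linearly with the batch, which is why a fixed storage budget caps the usable batch size. The abstract's point about balancing input and weight transfer goes further than this model does.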

Citations: 95

High-Performance Hardware Merge Sorter

2017 IEEE 25th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM) Pub Date : 2017-04-01 DOI: 10.1109/FCCM.2017.19

Susumu Mashimo, Thiem Van Chu, Kenji Kise

Citations: 46

Bonded Force Computations on FPGAs

2017 IEEE 25th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM) Pub Date : 2017-04-01 DOI: 10.1109/FCCM.2017.49

Qingqing Xiong, M. Herbordt

Citations: 7

FPGA-Based Real-Time Charged Particle Trajectory Reconstruction at the Large Hadron Collider

2017 IEEE 25th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM) Pub Date : 2017-03-28 DOI: 10.1109/FCCM.2017.27

E. Bartz, J. Chaves, Y. Gershtein, E. Halkiadakis, M. Hildreth, S. Kyriacou, K. Lannon, A. Lefeld, A. Ryd, L. Skinnari, R. Stone, C. Strohman, Z. Tao, B. Winer, P. Wittich, Zhiru Zhang, M. Zientek

Abstract: The upgrades of the Compact Muon Solenoid particle physics experiment at CERN's Large Hadron Collider provide a major challenge for the real-time collision data selection. This paper presents a novel approach to pattern recognition and charged particle trajectory reconstruction using an all-FPGA solution. The challenges include a large input data rate of about 20 to 40 Tbps; processing a new batch of input data every 25 ns, each consisting of about 10,000 precise position measurements of particles ('stubs'); performing the pattern recognition on these stubs to find the trajectories; and producing the list of parameters describing these trajectories within 4 μs. A proposed solution to this problem is described, in particular, the implementation of the pattern recognition and particle trajectory determination using an all-FPGA system. The results of an end-to-end demonstrator system based on Xilinx Virtex-7 FPGAs that meets timing and performance requirements are presented.
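The figures quoted in the abstract can be related to one another with a little arithmetic. The sketch below uses only the approximate numbers stated there (20 to 40 Tbps, one batch every 25 ns, about 10,000 stubs per batch, a 4 μs latency budget). The function names are hypothetical, and reading the last result as a number of batches in flight assumes a fully pipelined system that uses the whole latency budget.

```python
# Approximate figures taken from the abstract.
BATCH_PERIOD_NS = 25        # a new batch of input data every 25 ns
STUBS_PER_BATCH = 10_000    # about 10,000 stubs per batch
LATENCY_BUDGET_NS = 4_000   # results required within 4 microseconds


def bits_per_batch(rate_tbps):
    """Bits arriving in one batch period.

    1 Tbps for 1 ns is 1e12 * 1e-9 = 1000 bits, so integer
    arithmetic is exact here.
    """
    return rate_tbps * 1000 * BATCH_PERIOD_NS


def bits_per_stub(rate_tbps):
    """Average input bits available per stub at the given rate."""
    return bits_per_batch(rate_tbps) // STUBS_PER_BATCH


def batches_in_flight():
    """Batches being processed at once if each one may take the
    full latency budget and a new one arrives every batch period
    (assumes a fully pipelined design)."""
    return LATENCY_BUDGET_NS // BATCH_PERIOD_NS


if __name__ == "__main__":
    for rate in (20, 40):
        print(rate, "Tbps:", bits_per_batch(rate), "bits per batch,",
              bits_per_stub(rate), "bits per stub")
    print("batches in flight:", batches_in_flight())
```

Under these rounded inputs each batch carries roughly 0.5 to 1 Mbit, or about 50 to 100 bits per stub, and a pipeline that uses the whole 4 μs budget holds on the order of 160 batches at once.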

Citations: 2