Introduction to Special Issue on FPGAs in Data Centers

Ken Eguro, S. Neuendorffer, V. Prasanna, Hongbo Rong

ACM Transactions on Reconfigurable Technology and Systems (TRETS), published 2022-01-31. DOI: 10.1145/3493607
Abstract
Hardware accelerators have recently been used to augment the compute power of data centers and improve the performance of many applications, particularly latency-sensitive ones. Indeed, several commercial vendors now offer FPGAs in their cloud platforms. This special issue of ACM Transactions on Reconfigurable Technology and Systems presents advanced research on using FPGAs in data centers. The articles cover several topics, including the impact of terrestrial radiation; memory system optimization using FPGAs; use and management of network-accessible FPGAs; virtualization and runtime resource management for FPGAs; novel applications of FPGAs in data centers; FPGA IP cores for data center acceleration; latency and performance tradeoffs in using FPGAs for acceleration; and communication optimization using FPGAs.

In response to the call for papers, 21 papers were received. After a thorough review of these manuscripts following the ACM manuscript review guidelines, 13 papers were accepted. The accepted papers are grouped into two issues; this issue includes 10 of them.

The article "Elastic-DF: Scaling Performance of DNN Inference in FPGA Clouds through Automatic Partitioning" by Petrica et al. presents an automatic partitioning technique to maximize the performance and scalability of FPGA-based pipelined dataflow DNN inference accelerators on computing infrastructures consisting of multi-die, network-connected FPGAs. The article "xDNN: Inference for Deep Convolutional Neural Network" by D'Alberto et al. presents an end-to-end system for deep-learning inference on convolutional neural networks, based on a family of specialized hardware processors synthesized on FPGAs.

The article "Hardware Acceleration of High-Performance Computational Flow Dynamics Using High-Bandwidth Memory-enabled Field-programmable Gate Arrays" by Nane et al. studies the potential of using FPGAs in computational flow dynamics in the context of rapid advances in reconfigurable hardware, such as the growth in on-chip memory size, the increasing number of logic cells, and the integration of on-board high-bandwidth memories. The article "BurstZ+: Eliminating the Communication Bottleneck of Scientific Computing Accelerators via Accelerated Compression" by Jun et al. presents an accelerator platform that eliminates the communication bottleneck between PCIe-attached scientific computing accelerators and their host servers via hardware-optimized compression. The article "Request, Coalesce, Serve, and Forget: Miss-Optimized Memory Systems for Bandwidth-Bound Cache-unfriendly Applications on FPGAs" by Asiatici et al. presents an efficient on-chip memory system for applications such as graph analytics, designed to minimize the number of pipeline stalls. The article "NASCENT2: Generic Near-storage Sort Accelerator for Data Analytics on SmartSSD" by Salamat et al. presents an efficient algorithm for sorting database tables by partitioning the data into multiple smaller sort operations.
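The partition-then-merge strategy underlying accelerators such as NASCENT2 can be illustrated in software: split the input into fixed-size runs, sort each run independently (the step an accelerator would parallelize), then merge the sorted runs. This is a minimal, generic sketch of the idea, not the paper's actual hardware design; the function name and chunk size are illustrative assumptions.

```python
import heapq

def partitioned_sort(records, chunk_size):
    """Sort a large sequence by sorting fixed-size partitions
    independently, then merging the sorted runs.

    Illustrative only: a near-storage accelerator would sort the
    partitions in hardware, close to where the data resides."""
    # Sort each partition on its own (the parallelizable step).
    runs = [sorted(records[i:i + chunk_size])
            for i in range(0, len(records), chunk_size)]
    # Merge the sorted runs into one fully sorted output.
    return list(heapq.merge(*runs))

# Example: 8 keys sorted via two partitions of 4.
print(partitioned_sort([5, 3, 8, 1, 9, 2, 7, 4], chunk_size=4))
# → [1, 2, 3, 4, 5, 7, 8, 9]
```

Because each run is sorted independently, the per-partition work is embarrassingly parallel, and the final merge touches each element only once.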