2012 IEEE 20th International Symposium on Field-Programmable Custom Computing Machines: Latest Papers

Bus-based MPSoC Security through Communication Protection: A Latency-efficient Alternative
Pascal Cotret, J. Crenne, G. Gogniat, J. Diguet
{"title":"Bus-based MPSoC Security through Communication Protection: A Latency-efficient Alternative","authors":"Pascal Cotret, J. Crenne, G. Gogniat, J. Diguet","doi":"10.1109/fccm.2012.42","DOIUrl":"https://doi.org/10.1109/fccm.2012.42","url":null,"abstract":"Security in MPSoC is gaining an increasing attention since several years. Digital convergence is one of the numerous reasons explaining such a focus on embedded systems as much sensitive and secret data are now stored, manipulated and exchanged in these systems. Most solutions are currently built at the software level, we believe hardware enhancements also play a major role in system protection. One strategic point is the communication layer as all data goes through it. Monitoring and controlling communications enable to fend off attacks before system corruption. In this work, we propose an efficient solution with several hardware enhancements to secure data exchanges in a bus-based MPSoC. Our approach relies on low complexity distributed firewalls connected to all critical IPs of the system. Designers can deploy different security policies (access right, data format, authentication, confidentiality) in order to protect the system in a flexible way. To illustrate the benefit of such a solution, implementations are discussed for different MPSoCs implemented on Xilinx Virtex-6 FPGAs. 
Results demonstrate a reduction up to 33% in terms of latency overhead compared to existing efforts.","PeriodicalId":226197,"journal":{"name":"2012 IEEE 20th International Symposium on Field-Programmable Custom Computing Machines","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126805374","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 19
EBRAM - Extending the BlockRAMs in FPGAs to Support Caches and Hash Tables in an Efficient Manner
A. Ehliar
{"title":"EBRAM - Extending the BlockRAMs in FPGAs to Support Caches and Hash Tables in an Efficient Manner","authors":"A. Ehliar","doi":"10.1109/FCCM.2012.52","DOIUrl":"https://doi.org/10.1109/FCCM.2012.52","url":null,"abstract":"In this paper we discuss how a typical Block RAM in an FPGA can be extended to enable the implementation of more efficient caches in FPGAs with very minor modifications to the existing Block RAM architectures. In addition, the modifications also allow other components, such as hash tables, to be implemented more efficiently.","PeriodicalId":226197,"journal":{"name":"2012 IEEE 20th International Symposium on Field-Programmable Custom Computing Machines","volume":"502 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132322405","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Accelerating Millions of Short Reads Mapping on a Heterogeneous Architecture with FPGA Accelerator
Wen Tang, Wendi Wang, Bo Duan, Chunming Zhang, Guangming Tan, Peiheng Zhang, Ninghui Sun
{"title":"Accelerating Millions of Short Reads Mapping on a Heterogeneous Architecture with FPGA Accelerator","authors":"Wen Tang, Wendi Wang, Bo Duan, Chunming Zhang, Guangming Tan, Peiheng Zhang, Ninghui Sun","doi":"10.1109/FCCM.2012.39","DOIUrl":"https://doi.org/10.1109/FCCM.2012.39","url":null,"abstract":"The explosion of Next Generation Sequencing (NGS) data with over one billion reads per day poses a great challenge to the capability of current computing systems. In this paper, we proposed a CPU-FPGA heterogeneous architecture for accelerating a short reads mapping algorithm, which was built upon the concept of hash-index. In particular, by extracting and mapping the most time-consuming and basic operations to specialized processing elements (PEs), our new algorithm is favorable to efficient acceleration on FPGAs. The proposed architecture is implemented and evaluated on a customized FPGA accelerator card with a Xilinx Virtex5 LX330 FPGA resided. Limited by available data transfer bandwidth, our NGS mapping accelerator, which operates at 175MHz, integrates up to 100 PEs. Compared to an Intel six-cores CPU, the speedup of our accelerator ranges from 22.2 times to 42.9 times.","PeriodicalId":226197,"journal":{"name":"2012 IEEE 20th International Symposium on Field-Programmable Custom Computing Machines","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115742491","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 49
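The abstract above builds its short-read mapper on a hash index. As a rough software illustration of that general idea only (not the authors' algorithm or their PE design), the sketch below indexes every k-mer of a reference string, uses a read's first k-mer as a seed, and verifies each candidate position by counting mismatches. The function names, the seed choice, and the `max_mismatches` parameter are all assumptions made for this example.

```python
from collections import defaultdict


def build_hash_index(reference: str, k: int) -> dict[str, list[int]]:
    """Map every k-mer of the reference to the positions where it occurs."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def map_read(read: str, reference: str, index, k: int,
             max_mismatches: int = 1) -> list[int]:
    """Seed with the read's first k-mer, then verify each candidate position."""
    hits = []
    for pos in index.get(read[:k], []):
        window = reference[pos:pos + len(read)]
        if len(window) < len(read):
            continue  # the read would run past the end of the reference
        mismatches = sum(a != b for a, b in zip(read, window))
        if mismatches <= max_mismatches:
            hits.append(pos)
    return hits


reference = "ACGTACGTTAGC"
index = build_hash_index(reference, k=4)
positions = map_read("ACGTTA", reference, index, k=4)
```

The verification loop (comparing a read against a candidate window) is the kind of simple, repeated operation that lends itself to many parallel processing elements, which is presumably why such steps are offloaded to hardware.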
FLEXDET: Flexible, Efficient Multi-Mode MIMO Detection Using Reconfigurable ASIP
Xiaolin Chen, Andreas Minwegen, Yahia Hassan, D. Kammler, Shuai Li, T. Kempf, A. Chattopadhyay, G. Ascheid
{"title":"FLEXDET: Flexible, Efficient Multi-Mode MIMO Detection Using Reconfigurable ASIP","authors":"Xiaolin Chen, Andreas Minwegen, Yahia Hassan, D. Kammler, Shuai Li, T. Kempf, A. Chattopadhyay, G. Ascheid","doi":"10.1109/FCCM.2012.22","DOIUrl":"https://doi.org/10.1109/FCCM.2012.22","url":null,"abstract":"This paper describes the implementation of a multi-mode MIMO detector based on the concept of partially reconfigurable ASIP (rASIP). The multi-mode detector can support three different detection algorithms which are the Maximum Ratio Combining, the linear Minimum Mean Square Error (MMSE) detection, and the MMSE Successive Interference Cancellation. The detection algorithms also support different antenna configurations and modulation schemes. The rASIP is based on a Coarse-Grained Reconfigurable Architecture (CGRA), which is designed for efficient architectural support of matrix operations. A matrix inversion algorithm, which is used for the preprocessing of different detection algorithms, is mapped on the CGRA. By integrating a processor with the CGRA, the variations in the control path of different algorithm configurations can be handled efficiently. 
To the best of our knowledge, we show, for the first time that, a CGRA-based multi-mode MIMO detection is extremely efficient and matches the performance of dedicated ASIC implementation.","PeriodicalId":226197,"journal":{"name":"2012 IEEE 20th International Symposium on Field-Programmable Custom Computing Machines","volume":"82 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123854042","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 27
An Extensible and Portable Tool Suite for Managing Multi-Node FPGA Systems
Y. Rajasekhar, Rahul R. Sharma, R. Sass
{"title":"An Extensible and Portable Tool Suite for Managing Multi-Node FPGA Systems","authors":"Y. Rajasekhar, Rahul R. Sharma, R. Sass","doi":"10.1109/FCCM.2012.29","DOIUrl":"https://doi.org/10.1109/FCCM.2012.29","url":null,"abstract":"Current trends in Reconfigurable Computing point to an ever-increasing need for logic resources. This has led to the development and deployment of systems consisting of multiple FPGAs. These systems are advancing from scores of FPGAs to hundreds of FPGAs. Programming and administering these kind of systems while maintaining a non-intrusive physical footprint is expected to remain a continuing challenge. Future systems demand lower FPGA configuration and initialization turn-around times while maintaining a high degree of resource availability. Also, support for new configuration methodologies and portability to the latest generations of FPGA platforms is required. This paper describes an improved suite of tools with new features like job scheduling, bit stream management, multiple session control, debugging capabilities and remote access that will enable large scale High Performance Reconfigurable Computing.","PeriodicalId":226197,"journal":{"name":"2012 IEEE 20th International Symposium on Field-Programmable Custom Computing Machines","volume":"73 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122207571","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
On-the-fly Composition of FPGA-Based SQL Query Accelerators Using a Partially Reconfigurable Module Library
C. Dennl, Daniel Ziener, J. Teich
{"title":"On-the-fly Composition of FPGA-Based SQL Query Accelerators Using a Partially Reconfigurable Module Library","authors":"C. Dennl, Daniel Ziener, J. Teich","doi":"10.1109/FCCM.2012.18","DOIUrl":"https://doi.org/10.1109/FCCM.2012.18","url":null,"abstract":"In this paper, we introduce a novel FPGA-based methodology for accelerating SQL queries using dynamic partial reconfiguration. Query acceleration is of utmost importance in large database systems to achieve a very high throughput. Although common FPGA-based accelerators are suitable to achieve such a high throughput, their design is hard to extend for new operations. Using partial dynamic reconfiguration, we are able to build more flexible architectures which can be extended to new operations or SQL constructs with a very low area overhead on the FPGA. Furthermore, the reconfiguration of a few FPGA frames can be used to switch very fast from one query to the next. In our approach, an SQL query is transformed into a hardware pipeline consisting of partially reconfigurable modules. The assembly of the (FPGA) data path is done at run-time using a static system providing the stream-based communication interfaces to the partial modules and the database management system. More specifically, each incoming SQL query is analyzed and divided into single operations which are subsequently mapped onto library modules and the composed data path loaded on the FPGA. 
We show that our approach is able to achieve a substantially higher throughput compared to a software-only solution.","PeriodicalId":226197,"journal":{"name":"2012 IEEE 20th International Symposium on Field-Programmable Custom Computing Machines","volume":"118 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123229849","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 81
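The abstract above describes splitting a query into single operations, mapping each onto a library module, and chaining the modules into a streaming data path. A loose software analogy of that composition step might look like the sketch below. The module names (`filter`, `project`), the `MODULE_LIBRARY` table, and the operation-list format are hypothetical and stand in for the paper's partially reconfigurable hardware modules; no SQL parsing is attempted.

```python
from typing import Callable, Iterable, Iterator

Row = dict
Module = Callable[[Iterable[Row]], Iterator[Row]]


def make_filter(column: str, predicate: Callable) -> Module:
    """Module that passes through only rows whose column satisfies predicate."""
    def module(rows):
        return (r for r in rows if predicate(r[column]))
    return module


def make_project(columns: list) -> Module:
    """Module that keeps only the named columns of each row."""
    def module(rows):
        return ({c: r[c] for c in columns} for r in rows)
    return module


# Hypothetical module library: operation name -> module factory.
MODULE_LIBRARY = {"filter": make_filter, "project": make_project}


def compose_pipeline(operations) -> Module:
    """Chain library modules in order; each stage streams into the next."""
    stages = [MODULE_LIBRARY[name](*args) for name, args in operations]

    def pipeline(rows):
        stream = iter(rows)
        for stage in stages:
            stream = stage(stream)
        return stream
    return pipeline


# Roughly: SELECT name FROM items WHERE price > 10
operations = [
    ("filter", ("price", lambda p: p > 10)),
    ("project", (["name"],)),
]
rows = [{"name": "a", "price": 5}, {"name": "b", "price": 20}]
result = list(compose_pipeline(operations)(rows))
```

Adding a new operation here means adding one entry to the library table, which mirrors the extensibility argument the abstract makes for a module library.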
Implementing Murf: Accelerating Large State Space Exploration on FPGAs
Ma Tie, M. Leeser
{"title":"Implementing Murf: Accelerating Large State Space Exploration on FPGAs","authors":"Ma Tie, M. Leeser","doi":"10.1109/FCCM.2012.53","DOIUrl":"https://doi.org/10.1109/FCCM.2012.53","url":null,"abstract":"PHAST, a Pipelined Hardware Accelerated STate Checker, achieves a 30x end-to-end speedup of a large state space exploration application in the form of an explicit state model checker. PHAST is a re-implementation, to accommodate FPGA hardware, of the Murphi verifier developed at Stanford University. Explicit state model checking explores a large state space and checks properties defined by the user. The FPGA infrastructure for PHAST can be reused for many different models and properties. Our model of the DASH protocol is similar in size and complexity to models Intel uses to validate proposed features of future processors: state sizes between 1200 and 1800 bits and a transition relation with more than 100 rules. Analysis of the DASH model as verified by PHAST indicates that the speedup will stay constant independent of the model being explored. The current implementation of PHAST, implemented on an Alpha-Data board with a Xilinx Virtex 5 and 1 GB of SDRAM, has the ability to explore up to 300,000 states in the DASH model. This model, with close to one hundred thousand states and 220 rules, takes up less than forty percent on the Virtex chip and less than thirty percent of the block RAMs. PHAST takes advantage of the flexible memory architecture and inherent concurrency provided by an FPGA to explore large state spaces. With access to more memory, PHAST could explore a much larger state space. This paper focuses on the generic structure developed for a hardware implementation of model checking as an example of accelerating large state space exploration. 
The main contributions in this paper lie in the hardware implementation specifics, including pipelining state generation to generate a new state every cycle and check that invariants, or safety properties, hold for all states; as well as efficiently implementing hash compaction and hash table lookups with a CAM for duplication detection and collision handling. The current implementation of PHAST uses a CAM to improve the generated number of states by over 20,000. Large state space exploration is an application area particularly well suited to FPGA acceleration. State space exploration applications developed on GPUs have good results or small states, but none have managed to exhibit both characteristics.","PeriodicalId":226197,"journal":{"name":"2012 IEEE 20th International Symposium on Field-Programmable Custom Computing Machines","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127994551","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Impact of Cache Architecture and Interface on Performance and Area of FPGA-Based Processor/Parallel-Accelerator Systems
Jongsok Choi, Kevin Nam, Andrew Canis, J. Anderson, S. Brown, Tomasz S. Czajkowski
{"title":"Impact of Cache Architecture and Interface on Performance and Area of FPGA-Based Processor/Parallel-Accelerator Systems","authors":"Jongsok Choi, Kevin Nam, Andrew Canis, J. Anderson, S. Brown, Tomasz S. Czajkowski","doi":"10.1109/FCCM.2012.13","DOIUrl":"https://doi.org/10.1109/FCCM.2012.13","url":null,"abstract":"We describe new multi-ported cache designs suitable for use in FPGA-based processor/parallel-accelerator systems, and evaluate their impact on application performance and area. The baseline system comprises a MIPS soft processor and custom hardware accelerators with a shared memory architecture: on-FPGA L1 cache backed by off-chip DDR2 SDRAM. Within this general system model, we evaluate traditional cache design parameters (cache size, line size, associativity). In the parallel accelerator context, we examine the impact of the cache design and its interface. Specifically, we look at how the number of cache ports affects performance when multiple hardware accelerators operate (and access memory) in parallel, and evaluate two different hardware implementations of multi-ported caches using: 1) multi-pumping, and 2) a recently-published approach based on the concept of a live-value table. Results show that application performance depends strongly on the cache interface and architecture: for a system with 6 accelerators, depending on the cache design, speed up swings from 0.73× to 6.14×, on average, relative to a baseline sequential system (with a single accelerator and a direct-mapped, 2KB cache with 32B lines). 
Considering both performance and area, the best architecture is found to be a 4-port multi-pump direct-mapped cache with a 16KB cache size and a 128B line size.","PeriodicalId":226197,"journal":{"name":"2012 IEEE 20th International Symposium on Field-Programmable Custom Computing Machines","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128103232","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 60
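For readers unfamiliar with the cache parameters named in the abstract above, the sketch below works through the standard address split for a direct-mapped cache, using the reported best configuration (16KB cache, 128B lines) as input. It is generic textbook arithmetic, not code from the paper; the function name and the example address are assumptions.

```python
def split_address(addr: int, cache_size: int, line_size: int):
    """Split an address into (tag, index, offset) for a direct-mapped cache.

    Assumes cache_size and line_size are powers of two.
    """
    num_lines = cache_size // line_size
    offset_bits = line_size.bit_length() - 1   # log2(line_size)
    index_bits = num_lines.bit_length() - 1    # log2(num_lines)
    offset = addr & (line_size - 1)
    index = (addr >> offset_bits) & (num_lines - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset


# 16KB direct-mapped cache with 128B lines: 128 lines, 7 offset bits, 7 index bits.
tag, index, offset = split_address(0x12345, cache_size=16 * 1024, line_size=128)
```

With these parameters, two addresses that differ only above bit 13 map to the same line and evict each other, which is one reason line size and cache size are swept together in such studies.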
Online Measurement of Timing in Circuits: For Health Monitoring and Dynamic Voltage & Frequency Scaling
Joshua M. Levine, Edward A. Stott, G. Constantinides, P. Cheung
{"title":"Online Measurement of Timing in Circuits: For Health Monitoring and Dynamic Voltage & Frequency Scaling","authors":"Joshua M. Levine, Edward A. Stott, G. Constantinides, P. Cheung","doi":"10.1109/FCCM.2012.27","DOIUrl":"https://doi.org/10.1109/FCCM.2012.27","url":null,"abstract":"Reliability, power consumption and timing performance are key considerations for the utilisation of field-programmable gate arrays. Online measurement techniques can determine the timing characteristics of an FPGA application while it is operating, and facilitate a range of benefits. Degradation can be monitored by tracking changes in timing performance, while power consumption can be reduced through dynamic voltage scaling (DVS) of the power supply to exploit any spare timing headroom. If higher performance is the objective, dynamic frequency scaling (DFS) can be used to maximise operating frequency. In both cases, online timing measurement of the application circuit is used to exploit favourable operating conditions. This work demonstrates a method of online measurement, achieved by sweeping the phase of a secondary clock signal, driving additional shadowing registers strategically added to the application design. The measurement technique and initial voltage and frequency scaling experiments are demonstrated on an Altera Cyclone III FPGA. Timing performance can be measured with a best case resolution of 96ps. The additional circuitry results in minimal overhead in terms of area and performance. 
Power savings of 23% dynamic and 13% static in an example circuit are achieved through DVS, or performance improvements of 21% through DFS, when compared with operating at nominal core voltage, or timing model FMax.","PeriodicalId":226197,"journal":{"name":"2012 IEEE 20th International Symposium on Field-Programmable Custom Computing Machines","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129666473","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 37
Multi-Resolution Real-Time Dense Stereo Vision Processing in FPGA
Eduardo Gudis, G. V. D. Wal, S. Kuthirummal, S. Chai
{"title":"Multi-Resolution Real-Time Dense Stereo Vision Processing in FPGA","authors":"Eduardo Gudis, G. V. D. Wal, S. Kuthirummal, S. Chai","doi":"10.1109/FCCM.2012.15","DOIUrl":"https://doi.org/10.1109/FCCM.2012.15","url":null,"abstract":"High-performance dense stereo is a critical component of computer vision applications like 3D reconstruction, robot navigation, and augmented reality. In this paper, we present a low-power, high performance FPGA implementation of a stereo algorithm suitable for embedded real-time platforms. The design is scalable for higher resolution images and frame rates and supporting different cameras and application requirements. We achieve this by designing highly parallel computation cores with very efficient memory access to the image data. Using a prototype board, we demonstrate real-time stereo processing with 640×480 pixel GigE Vision cameras at 30 frames per second. We show that this FPGA design is 10 times lower power, more scalable and has lower latency, as compared to a GPU based implementation of the same stereo algorithm.","PeriodicalId":226197,"journal":{"name":"2012 IEEE 20th International Symposium on Field-Programmable Custom Computing Machines","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129813121","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 10