2020 IEEE 31st International Conference on Application-specific Systems, Architectures and Processors (ASAP): Latest Articles

Efficient FeFET Crossbar Accelerator for Binary Neural Networks
T. Soliman, R. Olivo, T. Kirchner, Cecilia De la Parra, M. Lederer, T. Kämpfe, A. Guntoro, N. Wehn
{"title":"Efficient FeFET Crossbar Accelerator for Binary Neural Networks","authors":"T. Soliman, R. Olivo, T. Kirchner, Cecilia De la Parra, M. Lederer, T. Kämpfe, A. Guntoro, N. Wehn","doi":"10.1109/ASAP49362.2020.00027","DOIUrl":"https://doi.org/10.1109/ASAP49362.2020.00027","url":null,"abstract":"This paper presents a novel ferroelectric field-effect transistor (FeFET) in-memory computing architecture dedicated to accelerating Binary Neural Networks (BNNs). We present in-memory convolution, batch normalization and dense layer processing through a grid of small crossbars with reduced unit size, which enables multiple bit operation and value accumulation. Additionally, we explore the possible parallelization of operations for maximized computational performance. Simulation results show that our new architecture achieves a computing performance up to 2.46 TOPS while achieving a high power efficiency reaching 111.8 TOPS/Watt and an area of 0.026 mm² in 22 nm FDSOI technology.","PeriodicalId":375691,"journal":{"name":"2020 IEEE 31st International Conference on Application-specific Systems, Architectures and Processors (ASAP)","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126348766","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
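The abstract above does not spell out how a binary layer is evaluated. As background only, here is a minimal sketch of the XNOR-popcount dot product commonly used in BNNs; it is a software illustration, not the paper's crossbar circuit. The function names and the bit packing (bit 1 means +1, bit 0 means -1) are assumptions made for this example.

```python
def binary_dot(a_bits: int, w_bits: int, n: int) -> int:
    """Dot product of two length-n vectors over {-1, +1}, packed as bits.

    Assumed encoding (illustrative): bit 1 stands for +1, bit 0 for -1.
    """
    mask = (1 << n) - 1
    agree = ~(a_bits ^ w_bits) & mask   # XNOR: 1 wherever the two signs match
    matches = bin(agree).count("1")     # popcount of the agreeing positions
    return 2 * matches - n              # matches minus mismatches


def unpack(bits: int, n: int) -> list:
    """Reference helper: expand packed bits into a list of -1/+1 values."""
    return [1 if (bits >> i) & 1 else -1 for i in range(n)]


def reference_dot(a_bits: int, w_bits: int, n: int) -> int:
    """Plain multiply-accumulate over the unpacked vectors, for comparison."""
    return sum(a * w for a, w in zip(unpack(a_bits, n), unpack(w_bits, n)))
```

Replacing multiply-accumulate with XNOR and a bit count is what makes binary networks attractive for in-memory hardware: each weight needs a single stored bit.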
Fast and Accurate Training of Ensemble Models with FPGA-based Switch
Jiuxi Meng, Ce Guo, Nadeen Gebara, W. Luk
{"title":"Fast and Accurate Training of Ensemble Models with FPGA-based Switch","authors":"Jiuxi Meng, Ce Guo, Nadeen Gebara, W. Luk","doi":"10.1109/ASAP49362.2020.00023","DOIUrl":"https://doi.org/10.1109/ASAP49362.2020.00023","url":null,"abstract":"Random projection is gaining more attention in large scale machine learning. It has been proved to reduce the dimensionality of a set of data whilst approximately preserving the pairwise distance between points by multiplying the original dataset with a chosen matrix. However, projecting data to a lower dimension subspace typically reduces the training accuracy. In this paper, we propose a novel architecture that combines an FPGA-based switch with the ensemble learning method. This architecture enables reducing training time while maintaining high accuracy. Our initial result shows a speedup of 2.12-6.77 times using four different high dimensionality datasets.","PeriodicalId":375691,"journal":{"name":"2020 IEEE 31st International Conference on Application-specific Systems, Architectures and Processors (ASAP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124436953","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
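The random projection step described in the abstract above (multiplying the data by a chosen matrix) can be sketched as follows. The abstract does not say which matrix is used; a Gaussian matrix with entries drawn from N(0, 1/k) is assumed here as one common choice, and all function names are illustrative.

```python
import math
import random


def random_projection_matrix(d: int, k: int, seed: int = 0) -> list:
    """Return a k x d matrix with i.i.d. N(0, 1/k) entries (assumed choice)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]


def project(matrix: list, x: list) -> list:
    """Map a d-dimensional point x to k dimensions by a matrix-vector product."""
    return [sum(m * v for m, v in zip(row, x)) for row in matrix]


def distance(x: list, y: list) -> float:
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
```

Comparing `distance(x, y)` with `distance(project(R, x), project(R, y))` for a few point pairs gives a feel for how much distortion a given target dimension `k` introduces.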
Accelerating Radiative Transfer Simulation with GPU-FPGA Cooperative Computation
Ryohei Kobayashi, N. Fujita, Y. Yamaguchi, T. Boku, K. Yoshikawa, Makito Abe, M. Umemura
{"title":"Accelerating Radiative Transfer Simulation with GPU-FPGA Cooperative Computation","authors":"Ryohei Kobayashi, N. Fujita, Y. Yamaguchi, T. Boku, K. Yoshikawa, Makito Abe, M. Umemura","doi":"10.1109/ASAP49362.2020.00011","DOIUrl":"https://doi.org/10.1109/ASAP49362.2020.00011","url":null,"abstract":"Field-programmable gate arrays (FPGAs) have garnered significant interest in research on high-performance computing. This is ascribed to the drastic improvement in their computational and communication capabilities in recent years owing to advances in semiconductor integration technologies that rely on Moore’s Law. In addition to these performance improvements, toolchains for the development of FPGAs in OpenCL have been offered by FPGA vendors to reduce the programming effort required. These improvements suggest the possibility of implementing the concept of enabling on-the-fly offloading computation at which CPUs/GPUs perform poorly relative to FPGAs while performing low-latency data transfers. We consider this concept to be of key importance to improve the performance of heterogeneous supercomputers that employ accelerators such as a GPU. In this study, we propose GPU–FPGA-accelerated simulation based on this concept and demonstrate the implementation of the proposed method with CUDA and OpenCL mixed programming. The experimental results showed that our proposed method can increase the performance by up to 17.4× compared with GPU-based implementation. This performance is still 1.32× higher even when solving problems with the largest size, which is the fastest problem size for GPU-based implementation. We consider the realization of GPU–FPGA-accelerated simulation to be the most significant difference between our work and previous studies.","PeriodicalId":375691,"journal":{"name":"2020 IEEE 31st International Conference on Application-specific Systems, Architectures and Processors (ASAP)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132217653","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Improved Side-Channel Resistance by Dynamic Fault-Injection Countermeasures
Jan Richter-Brockmann, T. Güneysu
{"title":"Improved Side-Channel Resistance by Dynamic Fault-Injection Countermeasures","authors":"Jan Richter-Brockmann, T. Güneysu","doi":"10.1109/ASAP49362.2020.00029","DOIUrl":"https://doi.org/10.1109/ASAP49362.2020.00029","url":null,"abstract":"Side-channel analysis and fault-injection attacks are known as serious threats to cryptographic hardware implementations and the combined protection against both is currently an open line of research. A promising countermeasure with considerable implementation overhead appears to be a mix of first-order secure Threshold Implementations and linear Error-Correcting Codes. In this paper we employ for the first time the inherent structure of non-systematic codes as fault countermeasure which dynamically mutates the applied generator matrices to achieve a higher-order side-channel and fault-protected design. As a case study, we apply our scheme to the PRESENT block cipher, which does not show any higher-order side-channel leakage after measuring 150 million power traces.","PeriodicalId":375691,"journal":{"name":"2020 IEEE 31st International Conference on Application-specific Systems, Architectures and Processors (ASAP)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121220213","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
ASAP 2020 Index
{"title":"ASAP 2020 Index","authors":"","doi":"10.1109/asap49362.2020.00043","DOIUrl":"https://doi.org/10.1109/asap49362.2020.00043","url":null,"abstract":"","PeriodicalId":375691,"journal":{"name":"2020 IEEE 31st International Conference on Application-specific Systems, Architectures and Processors (ASAP)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128162929","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A Parallel-friendly Majority Gate to Accelerate In-memory Computation
J. Reuben, Stefan Pechmann
{"title":"A Parallel-friendly Majority Gate to Accelerate In-memory Computation","authors":"J. Reuben, Stefan Pechmann","doi":"10.1109/ASAP49362.2020.00025","DOIUrl":"https://doi.org/10.1109/ASAP49362.2020.00025","url":null,"abstract":"Efforts to combat the ‘von Neumann bottleneck’ have been strengthened by Resistive RAMs (RRAMs), which enable computation in the memory array. Majority logic can accelerate computation when compared to NAND/NOR/IMPLY logic due to its expressive power. In this work, we propose a method to compute majority while reading from a transistor-accessed RRAM array. The proposed gate was verified by simulations using a physics-based model (for RRAM) and industry standard model (for CMOS sense amplifier) and found to tolerate reasonable variations in the RRAMs’ resistive states. Together with NOT gate, which is also implemented in-memory, the proposed gate forms a functionally complete Boolean logic, capable of implementing any digital logic. Computing is simplified to a sequence of READ and WRITE operations and does not require any major modifications to the peripheral circuitry of the array. The parallel-friendly nature of the proposed gate is exploited to implement an eight-bit parallel-prefix adder in memory array. The proposed in-memory adder could achieve a latency reduction of 70% and 50% when compared to IMPLY and NAND/NOR logic-based adders, respectively.","PeriodicalId":375691,"journal":{"name":"2020 IEEE 31st International Conference on Application-specific Systems, Architectures and Processors (ASAP)","volume":"33 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124341728","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
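The claim in the abstract above that majority plus NOT is functionally complete can be illustrated with a small software sketch. It builds a full adder from three majority gates and two inverters, then chains the adders into a simple ripple-carry adder. This is a ripple-carry design chosen for brevity, not the parallel-prefix adder the paper implements, and every name here is illustrative.

```python
def maj(a: int, b: int, c: int) -> int:
    """Three-input majority: 1 when at least two inputs are 1."""
    return 1 if a + b + c >= 2 else 0


def inv(a: int) -> int:
    """NOT gate on a single bit."""
    return 1 - a


def full_adder(a: int, b: int, cin: int) -> tuple:
    """One-bit full adder using only maj() and inv()."""
    cout = maj(a, b, cin)                         # carry is the majority itself
    s = maj(inv(cout), maj(a, b, inv(cin)), cin)  # sum from two more majorities
    return s, cout


def add(x: int, y: int, width: int = 8) -> int:
    """Ripple-carry sum of two width-bit values; the final carry is kept."""
    result, carry = 0, 0
    for i in range(width):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result | (carry << width)
```

Fixing one input also recovers the basic gates: `maj(a, b, 0)` behaves as AND and `maj(a, b, 1)` as OR.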
ParaHist: FPGA Implementation of Parallel Event-Based Histogram for Optical Flow Calculation
Mohammad Pivezhandi, Phillip H. Jones, Joseph Zambreno
{"title":"ParaHist: FPGA Implementation of Parallel Event-Based Histogram for Optical Flow Calculation","authors":"Mohammad Pivezhandi, Phillip H. Jones, Joseph Zambreno","doi":"10.1109/ASAP49362.2020.00038","DOIUrl":"https://doi.org/10.1109/ASAP49362.2020.00038","url":null,"abstract":"In this paper, we present an FPGA-based architecture for histogram generation to support event-based camera optical flow calculation. Our proposed histogram generation mechanism reduces memory and logic resources by storing the time difference between consecutive events, instead of the absolute time of each event. Additionally, we explore the trade-off between system resource usage and histogram accuracy as a function of the precision at which time is encoded. Our results show that across three event-based camera benchmarks we can reduce the encoding of time from 32 to 7 bits with a loss of only approximately 3% in histogram accuracy. In comparison to a software implementation, our architecture shows a significant speedup.","PeriodicalId":375691,"journal":{"name":"2020 IEEE 31st International Conference on Application-specific Systems, Architectures and Processors (ASAP)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125103010","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
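The idea in the abstract above of storing the time difference between consecutive events at reduced precision can be sketched in software. The abstract does not say how differences that exceed the available bits are handled; saturating at the largest representable value is an assumption made here, and the function name is illustrative.

```python
def encode_deltas(timestamps: list, bits: int) -> list:
    """Encode consecutive-event time differences in a fixed number of bits.

    Assumption (not from the source): differences larger than the
    representable maximum saturate at 2**bits - 1.
    """
    limit = (1 << bits) - 1
    return [min(curr - prev, limit)
            for prev, curr in zip(timestamps, timestamps[1:])]
```

With timestamps `[100, 105, 230, 231]`, seven bits hold every difference exactly, while six bits clip the 125-tick gap to 63. That clipping is the kind of accuracy loss traded against memory.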
[Copyright notice]
{"title":"[Copyright notice]","authors":"","doi":"10.1109/asap49362.2020.00003","DOIUrl":"https://doi.org/10.1109/asap49362.2020.00003","url":null,"abstract":"","PeriodicalId":375691,"journal":{"name":"2020 IEEE 31st International Conference on Application-specific Systems, Architectures and Processors (ASAP)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133933362","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A New Hardware Approach to Self-Organizing Maps
L. Dias, M. G. Coutinho, E. Gaura, Marcelo A. C. Fernandes
{"title":"A New Hardware Approach to Self-Organizing Maps","authors":"L. Dias, M. G. Coutinho, E. Gaura, Marcelo A. C. Fernandes","doi":"10.1109/ASAP49362.2020.00041","DOIUrl":"https://doi.org/10.1109/ASAP49362.2020.00041","url":null,"abstract":"Self-Organizing Maps (SOMs) are widely used as a data mining technique for applications that require data dimensionality reduction and clustering. Given the complexity of the SOM learning phase and the massive dimensionality of many data sets as well as their sample size in Big Data applications, high-speed processing is critical when implementing SOM approaches. This paper proposes a new hardware approach to SOM implementation, exploiting parallelization, to optimize the system’s processing time. Unlike most implementations in the literature, this proposed approach allows the parallelization of the data dimensions instead of the map, ensuring high processing speed regardless of data dimensions. An implementation with field-programmable gate arrays (FPGA) is presented and evaluated. Key evaluation metrics are processing time (or throughput) and FPGA area occupancy (or hardware resources).","PeriodicalId":375691,"journal":{"name":"2020 IEEE 31st International Conference on Application-specific Systems, Architectures and Processors (ASAP)","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129415893","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
A Template-based Framework for Exploring Coarse-Grained Reconfigurable Architectures
Artur Podobas, K. Sano, S. Matsuoka
{"title":"A Template-based Framework for Exploring Coarse-Grained Reconfigurable Architectures","authors":"Artur Podobas, K. Sano, S. Matsuoka","doi":"10.1109/ASAP49362.2020.00010","DOIUrl":"https://doi.org/10.1109/ASAP49362.2020.00010","url":null,"abstract":"Coarse-Grained Reconfigurable Architectures (CGRAs) are being considered as a complementary addition to modern High-Performance Computing (HPC) systems. These reconfigurable devices overcome many of the limitations of the (more popular) FPGA, by providing higher operating frequency, denser compute capacity, and lower power consumption. Today, CGRAs have been used in several embedded applications, including automobile, telecommunication, and mobile systems, but the literature on CGRAs in HPC is sparse and the field full of research opportunities. In this work, we introduce our CGRA simulator infrastructure for use in evaluating future HPC CGRA systems. Our CGRA simulator is built on synthesizable VHDL and is highly parametrizable, including support for connectivity, SIMD, data-type width, and heterogeneity. Unlike other related work, our framework supports co-integration with third-party memory simulators or evaluation of future memory architecture, which is crucial to reason around memory-bound applications. We demonstrate how our framework can be used to explore the performance of multiple different kernels, showing the impact of different configuration and design-space options.","PeriodicalId":375691,"journal":{"name":"2020 IEEE 31st International Conference on Application-specific Systems, Architectures and Processors (ASAP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128212961","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 19