2022 IEEE High Performance Extreme Computing Conference (HPEC): Latest Papers

A High-performance Deployment Framework for Pipelined CNN Accelerators with Flexible DSE Strategy
2022 IEEE High Performance Extreme Computing Conference (HPEC) Pub Date : 2022-09-19 DOI: 10.1109/HPEC55821.2022.9926377
Conghui Luo, Wen-Liang Huang, Dehao Xiang, Yihua Huang
Abstract: Pipelined deep convolutional neural network (DCNN) accelerators exploit inter-layer parallelism effectively, so they are widely used, for example in video stream processing. However, the large volume of intermediate results generated in a pipelined accelerator places a considerable burden on the on-chip storage resources of FPGAs. To ease this storage demand, the paper proposes a storage-optimized design space exploration (DSE) method at the cost of a slight drop in computing resource utilization. Experimental results show that the DSE strategy achieves computation engine (CE) utilization of 98.49% on VGG16 and 98.00% on ResNet101. In addition, the resource optimization strategy saves 27.84% of BRAM resources on VGG16, while CE utilization drops by only 3.04%. The paper also proposes an automated deployment framework that adapts to different networks with high computing resource utilization and balances the workload automatically by optimizing the computing resource allocation of each layer.
Citations: 0
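The per-layer workload balancing mentioned in the abstract above can be illustrated with a toy model. The sketch below is not the paper's DSE algorithm; it is a minimal greedy allocation, assuming each pipeline stage handles one layer, each layer's cost is a single operation count, and a stage's latency is its operation count divided by the number of compute units assigned to it. The names `balance_pipeline` and `utilization` are hypothetical.

```python
from math import ceil


def balance_pipeline(layer_ops, total_units):
    """Greedily assign compute units to pipeline stages (one stage per layer).

    Every stage starts with one unit; each remaining unit goes to the stage
    that is currently the slowest, i.e. the pipeline bottleneck.
    This is an illustrative heuristic, not the method from the paper.
    """
    if total_units < len(layer_ops):
        raise ValueError("need at least one compute unit per layer")
    alloc = [1] * len(layer_ops)
    for _ in range(total_units - len(layer_ops)):
        # Index of the stage with the largest cycle count (first one on ties).
        slowest = max(
            range(len(layer_ops)),
            key=lambda i: ceil(layer_ops[i] / alloc[i]),
        )
        alloc[slowest] += 1
    return alloc


def utilization(layer_ops, alloc):
    """Fraction of unit-cycles doing useful work in steady state.

    The pipeline advances at the pace of its slowest stage, so every unit
    is occupied for `bottleneck` cycles per input.
    """
    cycles = [ceil(ops / units) for ops, units in zip(layer_ops, alloc)]
    bottleneck = max(cycles)
    return sum(layer_ops) / (bottleneck * sum(alloc))


if __name__ == "__main__":
    ops = [400, 200, 100, 100]        # made-up per-layer operation counts
    alloc = balance_pipeline(ops, 8)  # 8 compute units in total
    print(alloc, utilization(ops, alloc))
```

In this toy model utilization reaches 1.0 only when every stage finishes in the same number of cycles, which is why balancing the allocation across layers matters for a pipelined design.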
Copyright and Reprint Permission
2022 IEEE High Performance Extreme Computing Conference (HPEC) Pub Date : 2022-09-19 DOI: 10.1109/hpec55821.2022.9926356
Citations: 0
Unsupervised Adaptation of Spiking Networks in a Gradual Changing Environment
2022 IEEE High Performance Extreme Computing Conference (HPEC) Pub Date : 2022-09-19 DOI: 10.1109/HPEC55821.2022.9926367
Zaidao Mei, Mark D. Barnell, Qinru Qiu
Abstract: Spiking neural networks (SNNs) have drawn broad research interest in recent years due to their high energy efficiency and biological plausibility, and they have proven competitive in many machine learning tasks. Like all artificial neural network (ANN) machine learning models, SNNs rely on the assumption that training and testing data are drawn from the same distribution. As the environment changes gradually, the input distribution shifts over time, and the performance of SNNs turns out to be brittle. To address this, we propose a unified framework that adapts to non-stationary streaming data by exploiting unlabeled intermediate domains and that fits the in-hardware SNN learning algorithm Error-modulated STDP. Specifically, we propose a self-training framework that generates pseudo labels to retrain the model for intermediate and target domains. In addition, we develop an online normalization method with an auxiliary neuron to normalize the output of the hidden layers. By combining normalization with self-training, our approach gains average classification improvements of over 10% on MNIST, N-MNIST, and two other datasets.
Citations: 0
Flexible Hardware Accelerator Design Generation with Spiral
2022 IEEE High Performance Extreme Computing Conference (HPEC) Pub Date : 2022-09-19 DOI: 10.1109/HPEC55821.2022.9926413
Guanglin Xu, J. Hoe, F. Franchetti
Abstract: Hardware specialization has become a widely employed technique for approaching higher performance and energy efficiency in computer systems. Yet obtaining efficient custom hardware designs remains a challenging and tedious task, calling for automated approaches. In the past, Spiral has been used to generate high-throughput streaming hardware designs for linear transform kernels. This paper is motivated by the observation that a memory-based iterative computing model may allow us to trade off throughput for algorithmic flexibility. We present a hardware generation approach that generates and optimizes algorithms using Spiral's multi-level domain-specific languages (DSLs), targeting a scalar load-store architecture. We have incorporated this approach into the Spiral system as a hardware backend. Our evaluation on several fundamental kernels shows flexibility with reasonable performance and resource utilization.
Citations: 0
Hardware Design and Implementation of Classic McEliece Post-Quantum Cryptosystem Based on FPGA
2022 IEEE High Performance Extreme Computing Conference (HPEC) Pub Date : 2022-09-19 DOI: 10.1109/HPEC55821.2022.9926295
Shaofen Chen, Hai-Tao Lin, Wen-Liang Huang, Yihua Huang
Abstract: With the development of the information age, the security of data transmission has attracted growing attention. In addition, quantum computers pose a great threat to widely used cryptographic algorithms. Classic McEliece is a post-quantum algorithm that offers high security and has withstood all kinds of attacks for decades. Wide application of the cryptosystem depends on a hardware implementation scheme, so this paper proposes a Classic McEliece implementation based on an FPGA platform. To balance resources and speed, the scheme adopts several methods. First, by exploiting the random-access characteristics of RAM, the clock cycle consumption of the error vector generation module is reduced by 95.1%. Second, multiple computing units are employed inside the module for parallel computing, which reduces the number of computing cycles by about 22.4%. Finally, the paper proposes a multiplexing syndrome decoding module; compared with the non-multiplexing scheme, it reduces LUT resource consumption by about 24.2% and FF resource consumption by about 15.4%.
Citations: 0
Real-Time Software Architecture for EM-Based Radar Signal Processing and Tracking
2022 IEEE High Performance Extreme Computing Conference (HPEC) Pub Date : 2022-09-19 DOI: 10.1109/HPEC55821.2022.9926338
Alan Nussbaum, B. Keel, W. Blair, U. Ramachandran
Abstract: While a radar tracks the kinematic state (position, velocity, and acceleration) of the target, optimal signal processing requires knowledge of the target's range rate and radial acceleration, which are derived from the tracking function in real time. High-precision tracks are achieved through precise range and angle measurements whose precision is determined by the signal-to-noise ratio (SNR) of the received signal. The SNR is maximized by minimizing the matched filter loss due to uncertainties in the radial velocity and acceleration of the target. In this paper, the Expectation-Maximization (EM) algorithm is proposed as an iterative signal processing scheme for maximizing the SNR by executing enhanced range walk compensation (i.e., correction for errors in the radial velocity and acceleration) in the real-time control loop software architecture. Maintaining a stringent timeline and adhering to latency requirements are essential for real-time sensor signal processing. This research aims to examine existing methods and explore new approaches and technologies to mitigate the harmful effects of range walk in tracking radar systems with an EM-based iterative algorithm, and to implement the new control loop steering methods in a real-time computing environment.
Citations: 0
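To make the range walk term in the abstract above concrete, the following sketch computes how far a target's range drifts over a processing interval when the assumed radial velocity and acceleration are in error. It is not taken from the paper; it assumes a simple constant-acceleration model, and the function names and the example numbers are hypothetical.

```python
def range_walk(velocity_error, accel_error, t):
    """Uncompensated range drift (metres) after t seconds.

    Assumes a constant-acceleration radial motion model:
        drift(t) = dv * t + 0.5 * da * t**2
    where dv (m/s) and da (m/s^2) are the errors in the radial velocity
    and radial acceleration used for compensation.
    """
    return velocity_error * t + 0.5 * accel_error * t * t


def walk_in_range_bins(velocity_error, accel_error, dwell, range_resolution):
    """Drift over the whole dwell, expressed in range-resolution cells."""
    return abs(range_walk(velocity_error, accel_error, dwell)) / range_resolution


if __name__ == "__main__":
    # Made-up numbers: 10 m/s velocity error, 2 m/s^2 acceleration error,
    # 0.5 s dwell, 1.5 m range resolution.
    print(range_walk(10.0, 2.0, 0.5))
    print(walk_in_range_bins(10.0, 2.0, 0.5, 1.5))
```

When the drift spans more than a fraction of one range cell, the echo energy spreads across cells and the matched filter loss grows, which is the effect the compensation is meant to reduce.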
Enabling Novel In-Memory Computation Algorithms to Address Next-Generation Throughput Constraints on SWaP-Limited Platforms
2022 IEEE High Performance Extreme Computing Conference (HPEC) Pub Date : 2022-09-19 DOI: 10.1109/HPEC55821.2022.9926297
Jessica Ray, C. Meiners
Abstract: The Department of Defense relies heavily on filtering and selection applications to help manage the overwhelming amount of data constantly received at the tactical edge. Filtering and selection are both latency and throughput constrained, and systems at the tactical edge must heavily optimize their SWaP (size, weight, and power) usage, which can reduce overall computation and memory performance. In-memory computation (IMC) provides a promising solution to the latency and throughput issues: it enables efficient processing of data as it is received, helping eliminate the memory bottleneck imposed by traditional von Neumann architectures. In this paper, we discuss a specific type of IMC accelerator known as a Content Addressable Memory (CAM), which effectively operates as a hardware-based associative array, allowing fast lookup and match operations. In particular, we consider ternary CAMs (TCAMs) and their use in string matching, which is an important component of many filtering and selection applications. Despite the benefits gained with TCAMs, designing applications that utilize them remains a difficult task. Straightforward questions, such as "how large should my TCAM be?" and "what is the expected throughput?", are difficult to answer because of the many factors that go into effectively mapping data into a TCAM. This work aims to help answer these types of questions with a new framework called Stardust-Chicken. Stardust-Chicken supports generating and simulating TCAMs, and implements state-of-the-art algorithms and data representations that can effectively map data into TCAMs. With Stardust-Chicken, users can explore the tradeoff space that comes with TCAMs and better understand how to utilize them in their applications.
Citations: 0
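The ternary matching that the abstract above relies on can be sketched in software. The model below is a generic TCAM illustration and has no connection to the Stardust-Chicken framework or its API; the class name `Tcam` and its methods are hypothetical. Each entry stores a value and a care mask, and a key matches when it agrees with the value on every bit the mask marks as significant.

```python
class Tcam:
    """Toy software model of a ternary content-addressable memory."""

    def __init__(self, width):
        self.width = width
        self.entries = []  # list of (value, care_mask), highest priority first

    def add(self, pattern):
        """Add an entry given as a string of '0', '1' and 'x' (don't care)."""
        if len(pattern) != self.width or set(pattern) - set("01xX"):
            raise ValueError("pattern must be %d chars of 0/1/x" % self.width)
        # Don't-care positions are stored as 0 in the value and 0 in the mask.
        value = int("".join("1" if c == "1" else "0" for c in pattern), 2)
        care = int("".join("0" if c in "xX" else "1" for c in pattern), 2)
        self.entries.append((value, care))

    def lookup(self, key):
        """Return the index of the first matching entry, or None.

        Real TCAM hardware compares all entries in parallel and a priority
        encoder picks the winner; this loop emulates that sequentially.
        """
        for index, (value, care) in enumerate(self.entries):
            if ((key ^ value) & care) == 0:
                return index
        return None


if __name__ == "__main__":
    tcam = Tcam(4)
    tcam.add("10x1")
    tcam.add("1xxx")
    print(tcam.lookup(0b1011), tcam.lookup(0b1100), tcam.lookup(0b0000))
```

The number of entries and the key width in such a model correspond directly to the "how large should my TCAM be?" question raised in the abstract, since wider keys and more patterns both increase the memory required.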
Proposed Empirical Assessment of Remote Workers' Cyberslacking and Computer Security Posture to Assess Organizational Cybersecurity Risks
2022 IEEE High Performance Extreme Computing Conference (HPEC) Pub Date : 2022-09-19 DOI: 10.1109/HPEC55821.2022.9926394
Ariel Luna, Y. Levy, G. Simco, Wei Li
Abstract: Cyberslacking is conducted by employees who use their companies' equipment and network for personal purposes instead of working during work hours. Cyberslacking has a significant adverse effect on overall employee productivity; however, with the recent COVID-19-driven move to remote working, it also poses a cybersecurity risk to organizations' networks and infrastructure. In this work-in-progress research study, we are developing, validating, and will empirically test a taxonomy to assess an organization's remote workers' risk level of cybersecurity threats. The study follows a three-phased developmental approach to building the Remote Worker Cyberslacking Security Risk Taxonomy. In collaboration with cybersecurity Subject Matter Experts (SMEs), the taxonomy will be used to assess the organization's remote workers' risk level of cybersecurity threats, using actual system indicators of productivity measures to estimate their cyberslacking and using organizational information to assess the computer security posture of the remote device used to access corporate resources. Anticipated results from 125 anonymous employees of one organization will then be assessed against the cybersecurity risk taxonomy, and recommendations will be provided to the organization's cybersecurity leadership.
Citations: 1
Machine learning for accurate and fast bandgap prediction of solid-state materials
2022 IEEE High Performance Extreme Computing Conference (HPEC) Pub Date : 2022-09-19 DOI: 10.1109/HPEC55821.2022.9926355
Shomik Verma, S. Kajale, Rafael Gómez-Bombarelli
Abstract: Semi-local DFT tends to vastly underestimate the bandgap of materials. Here we propose a machine learning calibration workflow to improve the accuracy of cheap DFT calculations. We first compile a dataset of 25k materials for which both PBE and HSE calculations have been completed. Using this dataset, we benchmark various machine learning architectures and features to determine which yields the highest accuracy. The best technique improves the accuracy of PBE 10-fold. We then expand the generalizability of the model by using active learning to intelligently sample chemical space. Because HSE data is not available for these new materials, we develop an optimized, high-throughput, parallelized workflow to calculate HSE bandgaps of 10k additional materials. The result is a cheap, accurate, and generalized ML model for bandgap prediction.
Citations: 0
FPGA Acceleration of Fully Homomorphic Encryption over the Torus
2022 IEEE High Performance Extreme Computing Conference (HPEC) Pub Date : 2022-09-19 DOI: 10.1109/HPEC55821.2022.9926381
Tian Ye, R. Kannan, V. Prasanna
Abstract: Fully Homomorphic Encryption over the Torus (TFHE) is a promising approach for secure computing in cloud servers because it performs computations directly on encrypted data. However, TFHE has much higher computational complexity than its unencrypted counterpart. In this work, we propose an FPGA accelerator for TFHE computations. We illustrate the effects of an optimization called bootstrapping key unrolling on the tradeoff between bootstrapping performance and FPGA resource consumption. We customize the data layout of TFHE ciphertext to optimize data access and improve data reuse. We parameterize the FPGA design for TFHE bootstrapping so that it can be configured to achieve high performance for different user-specified security requirements and given FPGA resources. We implement our design on a state-of-the-art FPGA and compare it with existing results on CPUs. Our implementation of TFHE bootstrapping achieves a 216x improvement in throughput and a 16.5x improvement in latency compared with the software baseline on a state-of-the-art CPU server.
Citations: 5