2018 IEEE 29th International Conference on Application-specific Systems, Architectures and Processors (ASAP): Latest Publications

Implementing and Parallelizing Real-time Lane Detection on Heterogeneous Platforms

2018 IEEE 29th International Conference on Application-specific Systems, Architectures and Processors (ASAP) Pub Date : 2018-07-01 DOI: 10.1109/ASAP.2018.8445110

Xiebing Wang, C. Kiwus, Canhao Wu, Biao Hu, Kai Huang, A. Knoll

Abstract: Lane detection is a cardinal functionality in state-of-the-art Advanced Driver Assistant Systems (ADAS). However, it is still not straightforward to meet the real-time performance demands of processing High Definition (HD) images with high robustness and scalability. To address this problem, we propose an improved lane detection algorithm based on top-view image transformation and two-stage RANdom SAmple Consensus (RANSAC) model fitting. By adapting an affine homography matrix off-line to bound an adaptive Region Of Interest (ROI) for the subsequent on-line Warp Perspective Mapping (WPM) transformation, the algorithm can analyze arbitrary on-road videos and generate an adaptive ROI without a priori knowledge of the camera parameters. To ensure scalability, we present a comprehensive parallel design of the application on a heterogeneous system consisting of a multi-core CPU, a GPU and an FPGA. We show in detail how the potentially parallel task loads are implemented and optimized so that each can be mapped to the most suitable processor to achieve optimal performance. Experimental results reveal that our improved algorithm can robustly process video streams with higher accuracy. Moreover, the heterogeneous executions are capable of processing HD $1920 \times 1080$ images at 81.6 fps and 47.9 fps on an AMD FirePro W7100 GPU and a Terasic Arria 10 FPGA, respectively.
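The model-fitting step named in the abstract can be illustrated with a generic, single-stage RANSAC line fit. This is only a minimal sketch, not the authors' two-stage variant: the function name `ransac_line` and the parameters `n_iters`, `tol`, and `seed` are assumptions made for illustration.

```python
import random


def ransac_line(points, n_iters=200, tol=1.0, seed=0):
    """Fit a line a*x + b*y + c = 0 to 2-D points with basic RANSAC.

    Hypothetical helper for illustration; returns the best (a, b, c)
    with a unit-length normal (a, b) and the list of inlier points.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    best_line, best_inliers = None, []
    for _ in range(n_iters):
        # 1. Draw a minimal sample: two points define a candidate line.
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        a, b = y2 - y1, x1 - x2
        norm = (a * a + b * b) ** 0.5
        if norm == 0:
            continue  # identical points, no line defined
        a, b = a / norm, b / norm
        c = -(a * x1 + b * y1)
        # 2. Collect points whose perpendicular distance is within tol.
        inliers = [(x, y) for x, y in points if abs(a * x + b * y + c) <= tol]
        # 3. Keep the candidate supported by the most inliers.
        if len(inliers) > len(best_inliers):
            best_line, best_inliers = (a, b, c), inliers
    return best_line, best_inliers
```

In a top-view image a lane marking is close to a vertical line, so calling `ransac_line` on edge pixels that mostly lie along one column should return that column as the line and leave stray pixels out of the inlier set.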

Citations: 5

Adaptively Banded Smith-Waterman Algorithm for Long Reads and Its Hardware Accelerator

2018 IEEE 29th International Conference on Application-specific Systems, Architectures and Processors (ASAP) Pub Date : 2018-07-01 DOI: 10.1109/ASAP.2018.8445105

Yi-Lun Liao, Yu-Cheng Li, Nae-Chyun Chen, Yi-Chang Lu

Citations: 18

Fast Energy Estimation Through Partial Execution of HPC Applications

2018 IEEE 29th International Conference on Application-specific Systems, Architectures and Processors (ASAP) Pub Date : 2018-07-01 DOI: 10.1109/ASAP.2018.8445089

Juan Carlos Salinas-Hilburg, Marina Zapater, Jose M. Moya, J. Ayala

Abstract: To optimize the energy use of servers in data centers, techniques such as power capping or power budgeting are usually deployed. These techniques rely on predicting the power and execution time of applications. These data are obtained via dynamic profiling, which requires a full execution of the application. This is not feasible for High Performance Computing (HPC) applications with long execution times. In this paper, we present a methodology to estimate the dynamic CPU and memory energy consumption of an application without executing it completely. Our methodology merges static code analysis information and dynamic profiling via the partial execution of the application. We do so by leveraging the concept of an application signature, defined as a reduced version of the application in terms of execution time and power profile. We validate our methodology with a set of CPU-intensive and memory-intensive benchmarks and multi-threaded applications on a presently shipping enterprise server. Our energy estimation methodology shows an overall error below 8.0% when compared to the dynamic energy of the whole execution of the application. Our methodology also makes it possible to estimate the energy of multi-threaded applications with an RMSE of 12.7% when compared to the dynamic energy from the complete parallel execution.

Citations: 1

A Unified Backend for Targeting FPGAs from DSLs

2018 IEEE 29th International Conference on Application-specific Systems, Architectures and Processors (ASAP) Pub Date : 2018-07-01 DOI: 10.1109/ASAP.2018.8445108

Emanuele Del Sozzo, Riyadh Baghdadi, Saman P. Amarasinghe, M. Santambrogio

Abstract: The major flaw of Field Programmable Gate Arrays (FPGAs) is that they are hard to program and have a steep learning curve. Even though High-Level Synthesis (HLS) tools may alleviate this problem by providing directives to optimize the hardware design, as well as by supporting languages like C/C++ and OpenCL, developing efficient designs for FPGAs is still a challenging and time-consuming task. In this context, Domain Specific Languages (DSLs) represent an emerging solution for generating efficient code that targets FPGAs. However, FPGA support for these languages is still limited, and only a few DSLs provide FPGA backends. This paper describes FROST, a unified backend for targeting FPGAs from DSLs. FROST takes as input an algorithm described in one of the supported DSLs and generates an optimized design suitable for HLS tools. To this end, FROST exposes a high-level scheduling co-language to drive many aspects of the optimization process, such as the resulting architecture and the level of parallelism. We evaluated FROST on a set of image processing kernels, developed in Halide and TIRAMISU, and compared the results against a hand-tuned FPGA library. The experimental results demonstrate that FROST designs are able to match the performance of such a library (exploiting the same level of parallelism), and surpass it by a factor of 10× when combining FROST and the frontends' scheduling commands.

Citations: 14

A Real-Time Learning-Based Super-Resolution System Using Direct Simple Functions

2018 IEEE 29th International Conference on Application-specific Systems, Architectures and Processors (ASAP) Pub Date : 2018-07-01 DOI: 10.1109/ASAP.2018.8445121

Daolu Zha, Xi Jin, Rui Shang, Pengfei Yang

Citations: 0

Real-Time High-Quality Stereo Matching System on a GPU

2018 IEEE 29th International Conference on Application-specific Systems, Architectures and Processors (ASAP) Pub Date : 2018-07-01 DOI: 10.1109/ASAP.2018.8445111

Qiong Chang, T. Maruyama

Citations: 5

A Soft Dual-Processor System with a Partially Run-Time Reconfigurable Shared 128-Bit SIMD Engine

2018 IEEE 29th International Conference on Application-specific Systems, Architectures and Processors (ASAP) Pub Date : 2018-07-01 DOI: 10.1109/ASAP.2018.8445115

J. Ordaz, Dirk Koch

Citations: 6

Towards Hardware Accelerated Reinforcement Learning for Application-Specific Robotic Control

2018 IEEE 29th International Conference on Application-specific Systems, Architectures and Processors (ASAP) Pub Date : 2018-07-01 DOI: 10.1109/ASAP.2018.8445099

Shengjia Shao, Jason Tsai, Michal Mysior, W. Luk, T. Chau, Alexander Warren, B. Jeppesen

Abstract: Reinforcement Learning (RL) is an area of machine learning in which an agent interacts with the environment by making sequential decisions. The agent receives a reward from the environment based on how good the decisions are and tries to find an optimal decision-making policy that maximises its long-term cumulative reward. This paper presents a novel approach which has shown promise in applying accelerated simulation of RL policy training to automating the control of a real robot arm for specific applications. The approach has two steps. First, design space exploration techniques are developed to enhance the performance of an FPGA accelerator for RL policy training based on Trust Region Policy Optimisation (TRPO), which results in a 43% speed improvement over a previous FPGA implementation, while achieving a 4.65 times speed-up against deep learning libraries running on a GPU and a 19.29 times speed-up against a CPU. Second, the trained RL policy is transferred to a real robot arm. Our experiments show that the trained arm can successfully reach and pick up predefined objects, demonstrating the feasibility of our approach.
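The long-term cumulative reward that the agent maximises is commonly written as a discounted sum of per-step rewards. The sketch below computes that sum for one episode; the discount factor `gamma` and the function name `discounted_return` are assumptions for illustration and are not taken from the paper.

```python
def discounted_return(rewards, gamma=0.99):
    """Return r_0 + gamma*r_1 + gamma^2*r_2 + ... for one episode.

    Hypothetical helper; `gamma` is an assumed discount factor in [0, 1].
    """
    total = 0.0
    # Accumulate from the last step backwards so each earlier reward
    # adds gamma times everything that follows it.
    for r in reversed(rewards):
        total = r + gamma * total
    return total
```

With `gamma=1.0` this reduces to the plain sum of rewards; smaller values weight near-term rewards more heavily.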

Citations: 19

Five-point algorithm: An efficient cloud-based FPGA implementation

2018 IEEE 29th International Conference on Application-specific Systems, Architectures and Processors (ASAP) Pub Date : 2018-07-01 DOI: 10.1109/ASAP.2018.8445097

Marco Rabozzi, Emanuele Del Sozzo, Lorenzo Di Tucci, M. Santambrogio

Citations: 0

A Reading Comprehension Style Question Answering Model Based On Attention Mechanism

2018 IEEE 29th International Conference on Application-specific Systems, Architectures and Processors (ASAP) Pub Date : 2018-07-01 DOI: 10.1109/ASAP.2018.8445117

Linlong Xiao, Nanzhi Wang, Guocai Yang

Abstract: In recent years, research on reading-comprehension question answering has drawn intense attention in Natural Language Processing. However, it is still a key issue to […] the high-level semantic vector representation of the question and paragraph. Drawing inspiration from DrQA [1], which is a question answering system proposed by Facebook, this paper proposes an attention-based question answering […] adds the binary representation of the paragraph, the paragraph's attention to the question, and the question's attention to the paragraph. Meanwhile, a self-attention calculation method is proposed to enhance the question semantic vector representation. Besides, it uses multi-layer bidirectional Long Short-Term Memory (BiLSTM) networks to calculate the high-level semantic vector representations of paragraphs and questions. Finally, bilinear functions are used to calculate the probability of the answer's position in the paragraph. The experimental results on the Stanford Question Answering Dataset (SQuAD) development set show that the F1 score is 80.1% and […] 71.4%, which demonstrates that the performance of the […] is better than that of the DrQA model, since they increase by 2% and 1.3% respectively.
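The final step in the abstract, scoring each paragraph position with a bilinear function, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `bilinear_position_probs`, the plain-list vectors, the weight matrix `W`, and the softmax normalisation over positions are all assumed for the example.

```python
import math


def bilinear_position_probs(paragraph_vecs, W, question_vec):
    """Score each paragraph token i with p_i^T W q, then softmax.

    Hypothetical helper: paragraph_vecs is a list of token vectors,
    W is a square weight matrix (list of rows), question_vec is one vector.
    """
    # Project the question once: wq = W q.
    wq = [sum(w * q for w, q in zip(row, question_vec)) for row in W]
    # Bilinear score for each position: p_i . (W q).
    scores = [sum(p * v for p, v in zip(vec, wq)) for vec in paragraph_vecs]
    # Numerically stable softmax over paragraph positions.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]
```

In a span-extraction model, one such distribution would typically be computed for the answer's start position and another, with a separate weight matrix, for its end position.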

Citations: 11