{"title":"Implementation of FPGA-based Accelerator for Deep Neural Networks","authors":"T. Tsai, Yuan-Chen Ho, M. Sheu","doi":"10.1109/DDECS.2019.8724665","DOIUrl":"https://doi.org/10.1109/DDECS.2019.8724665","url":null,"abstract":"At present, there is much research on deep neural networks (DNNs) applied to everyday life. In object recognition, deep convolutional neural networks (CNNs) perform well, but they rely on GPUs to carry out a large number of complex operations. Hardware accelerators for DNNs have therefore attracted considerable interest. Implementing a DNN model in hardware requires handling complex connection relationships and scheduling memory usage. This paper presents the design of an FPGA-based accelerator for DNNs. The proposed architecture is implemented on a Xilinx Zynq-7020 FPGA. It achieves low latency and low resource usage on the MNIST digit recognition task while maintaining a 96% recognition rate.","PeriodicalId":197053,"journal":{"name":"2019 IEEE 22nd International Symposium on Design and Diagnostics of Electronic Circuits & Systems (DDECS)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123181553","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Digitalized-Management Voltage-Domain Programmable Mechanisms for Dual-Vdd Low-Power Embedded Digital Systems","authors":"Ching-Hwa Cheng, Tang-Chieh Liu","doi":"10.1109/DDECS.2019.8724664","DOIUrl":"https://doi.org/10.1109/DDECS.2019.8724664","url":null,"abstract":"Built-in digitalized power management (DPMM) and voltage-domain programmable (VDP) mechanisms are proposed for designing low-power systems. With the proposed techniques, the high and low voltages applied to logic modules are switchable. This flexible voltage-domain assignment allows chip performance and power consumption to be adjusted dynamically during circuit operation. To support the DPMM and VDP mechanisms, a voltage-level monitor circuit and a power-switch circuit are designed to provide multiple operation modes for DPMM-VDP digital circuit designs. A powerless retention flip-flop is developed for temporary data storage during dynamic voltage-domain switching. To prevent system failures caused by voltage integrity problems, a built-in voltage-level monitoring mechanism monitors voltage integrity during VDP circuit operation. The proposed mechanism allows chip performance and power consumption to be flexibly adjusted during circuit operation. Physically implemented chips and measured results show that this methodology achieves a 30% to 55% power reduction compared with a single-Vdd design.","PeriodicalId":197053,"journal":{"name":"2019 IEEE 22nd International Symposium on Design and Diagnostics of Electronic Circuits & Systems (DDECS)","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134397477","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"High side power MOSFET switch driver for a low-power AC/DC converter","authors":"M. Potocný, J. Brenkus, V. Stopjaková","doi":"10.1109/DDECS.2019.8724667","DOIUrl":"https://doi.org/10.1109/DDECS.2019.8724667","url":null,"abstract":"With the emergence of always-on wireless sensing nodes, AC/DC power conversion solutions for sub 1 W applications are required. Existing approaches are not efficient for such output loads, and therefore, new solutions need to be provided. In this paper, we propose a solution that is optimized for operation with output loads up to 500 mW, while high efficiency and close to zero no-load consumption have been our foremost design goals. The proposed design is implemented in a high-voltage CMOS process and transistor level simulation results show improved properties of the proposed solution over the existing ones.","PeriodicalId":197053,"journal":{"name":"2019 IEEE 22nd International Symposium on Design and Diagnostics of Electronic Circuits & Systems (DDECS)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127920266","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hash-based Pattern Matching for High Speed Networks","authors":"Tomás Fukac, J. Korenek","doi":"10.1109/DDECS.2019.8724652","DOIUrl":"https://doi.org/10.1109/DDECS.2019.8724652","url":null,"abstract":"Regular expression matching is a complex task that is widely used in network security monitoring applications. With the growing speed of network links and the growing number of regular expressions, pattern matching architectures have to be improved to retain wire-speed processing. Multi-striding is a well-known technique for increasing processing speed, but it requires a lot of FPGA resources. Therefore, we focus on the design of a new hardware architecture for fast pre-filtering of network traffic. The proposed pre-filter performs fast hash-based matching of short strings that are specific to the matched regular expressions. As the proposed pre-filter significantly reduces input traffic, exact pattern matching can operate at significantly lower speeds. The exact pattern match can then be done by a CPU or by a slow automaton with few hardware resources. The paper provides analyses of false-positive detection of the pre-filter with respect to the length of the matching strings. The number of false positives is low, even if the selected strings are short. Therefore, input traffic can be significantly reduced. For 100 Gb links, the pre-filter reduced the input data to 1.83 Gbps using four-symbol strings.","PeriodicalId":197053,"journal":{"name":"2019 IEEE 22nd International Symposium on Design and Diagnostics of Electronic Circuits & Systems (DDECS)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121675702","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A 5 to 10.5 GHz Low-power Wideband I/Q Transmitter with Integrated Current-Mode Logic Frequency Divider","authors":"H. Chiou, Wei-Min Sung","doi":"10.1109/DDECS.2019.8724638","DOIUrl":"https://doi.org/10.1109/DDECS.2019.8724638","url":null,"abstract":"This paper presents a 5-10.5 GHz wideband fully-integrated I/Q transmitter in tsmc™ 90-nm CMOS technology. A current-mode passive mixer was adopted to enhance the linearity and low-power performance. The transmitter also integrates a current-mode logic (CML) frequency divider (FD) to generate I/Q signals at the LO port. The I/Q signals were directly combined by using high-Q top-metal lines. An inductively coupled resonator (ICR) wideband output matching network was used to convert the balanced signal to an unbalanced output in the RF drive amplifier. The proposed transmitter achieves a 3-dB bandwidth from 5 to 10.5 GHz, a conversion gain of 12.9 dB, an output P1dB of -4.17 dBm, an output IP3 of 16.47 dBm, a carrier suppression of 30.02 dBc and a sideband suppression of 39.62 dBc under an LO power of 14 dBm at the center frequency of 8 GHz. The chip consumes 66.36 mW of DC power. The chip dimensions, including all RF and DC pads, are 1.25 × 1.1 mm².","PeriodicalId":197053,"journal":{"name":"2019 IEEE 22nd International Symposium on Design and Diagnostics of Electronic Circuits & Systems (DDECS)","volume":"3 3","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120920915","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Sketch Classifier Technique with Deep Learning Models Realized in an Embedded System","authors":"T. Tsai, Po-Ting Chi, Kuo-Hsing Cheng","doi":"10.1109/DDECS.2019.8724656","DOIUrl":"https://doi.org/10.1109/DDECS.2019.8724656","url":null,"abstract":"Since 2011, growth in the amount of information, innovation in learning algorithms, and improvements in computer technology have made the application of artificial intelligence feasible in a wide range of fields. This paper presents a sketch classifier technique with deep learning models. We use a depth-wise convolution layer to lighten the deep neural network. The result shows the improvement in approximately 1/5 of computation. We use the Google Quick Draw dataset to train and evaluate the network, which achieves 98% accuracy on 10 categories and 85% accuracy on 100 categories. Finally, we realize it on an STM32F469I Discovery development board for demonstration. The system achieves real-time sketch classification.","PeriodicalId":197053,"journal":{"name":"2019 IEEE 22nd International Symposium on Design and Diagnostics of Electronic Circuits & Systems (DDECS)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133779140","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Acceleration of Feature Extraction for Real-Time Analysis of Encrypted Network Traffic","authors":"R. Vrána, J. Korenek, David Novak","doi":"10.1109/DDECS.2019.8724658","DOIUrl":"https://doi.org/10.1109/DDECS.2019.8724658","url":null,"abstract":"With the growing amount of encrypted network traffic, it is important to have tools for the analysis and classification of encrypted network data. Encrypted network traffic is usually analysed by statistical methods because Deep Packet Inspection or pattern matching is not applicable. However, the statistical methods are usually designed to work offline on already captured network traffic. For real-time analysis, hardware acceleration is needed to achieve wire-speed 10 Gbps throughput. Therefore, we focus on real-time monitoring of encrypted network traffic and propose a new acceleration method to extract features from encrypted network data. Approximate computing is used to speed up the computation of entropy for the input data stream and to reduce FPGA logic utilization. As can be seen in the results, the precision of classification has decreased only by 0.1 to 0.2. Moreover, the proposed hardware architecture has very low FPGA logic utilization and can operate at a high frequency.","PeriodicalId":197053,"journal":{"name":"2019 IEEE 22nd International Symposium on Design and Diagnostics of Electronic Circuits & Systems (DDECS)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126053788","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Testability Measures Considering Circuit Reconvergence to Reduce ATPG Runtime","authors":"Kai-Hsun Chen, Ching-Yuan Chen, Jiun-Lang Huang","doi":"10.1109/DDECS.2019.8724660","DOIUrl":"https://doi.org/10.1109/DDECS.2019.8724660","url":null,"abstract":"Reconvergence has been recognized as the main reason for ATPG backtrack. It induces not only more, but also prolonged backtracks and causes more severe performance degradation than expected. In this paper, we propose a reconvergence-aware testability measure to better guide the ATPG justification process. Experiment results show that the proposed method significantly decreases the ATPG runtime, especially for circuits with deep logic level, by up to 76%.","PeriodicalId":197053,"journal":{"name":"2019 IEEE 22nd International Symposium on Design and Diagnostics of Electronic Circuits & Systems (DDECS)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130268894","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Architecture-aware Memory Access Scheduling for High-throughput Cascaded Classifiers","authors":"Hsiang-Chih Hsiao, Chun-Wei Chen, Jonas Wang, Ming-Der Shieh, Pei-Yin Chen","doi":"10.1109/DDECS.2019.8724671","DOIUrl":"https://doi.org/10.1109/DDECS.2019.8724671","url":null,"abstract":"Cascaded-classifier-based object detectors are popular in many applications because of their high efficiency. Much research has been devoted to developing the corresponding hardware accelerators. To reduce circuit complexity while maintaining sufficient throughput, on-chip memories are commonly partitioned into several banks for parallel data access. However, since the coefficients of feature extraction are irregular, memory access conflicts occur frequently without proper scheduling. The proposed scheme explicitly schedules the access sequence as a post-processing step for managing the coefficient memory. By formulating the desired sequence as a graph model, classical graph coloring theory can then be adopted to solve the scheduling problem. In addition, the proposed graph model also considers the resource constraint on intermediate storage. Experimental results show that the throughput and area efficiency of the target cascaded classifier can be greatly improved by adopting the proposed scheme, as compared to the related work.","PeriodicalId":197053,"journal":{"name":"2019 IEEE 22nd International Symposium on Design and Diagnostics of Electronic Circuits & Systems (DDECS)","volume":"101 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124330358","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}