{"title":"A conceptual toolchain for an application domain specific reconfigurable logic architecture","authors":"T. Bostelmann, S. Sawitzki","doi":"10.1109/ReConFig.2014.7032487","DOIUrl":"https://doi.org/10.1109/ReConFig.2014.7032487","url":null,"abstract":"In this paper we present a concept of a reconfigurable logic toolchain. The specialty of this toolchain is the highly configurable architecture design. The goal is to provide the designer with the ability to suit the architecture of the reconfigurable logic to a specific application domain, for example communication or image processing. Thereby the disadvantages of very flexible, universal structures like FPGAs (inefficient resource usage and high communication overhead) can be diminished. At the same time their advantages (short time to market and low non-recurring engineering costs) can be kept. To achieve this a graphical architecture editor allows the user to adapt the global design structure as well as the detailed implementation of the logic cells. An analysis tool reports how frequently the different logic and routing resources of the described architecture are utilized by a given set of applications, to allow an optimization towards a specific application domain. We show the degrees of freedom the envisioned toolchain offers and discuss the corresponding trade-offs. Furthermore we show which steps of the development toolchain have to be adapted to the needs of such a flexible architecture and how this can be done. 
Finally we present the status of this work in progress and give an outlook on the planned future work.","PeriodicalId":137331,"journal":{"name":"2014 International Conference on ReConFigurable Computing and FPGAs (ReConFig14)","volume":"44 8","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"113999964","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FPGA-based design and implementation of direct torque control for induction machines","authors":"M. Zare, R. Kavasseri, Cristinel Ababei","doi":"10.1109/ReConFig.2014.7032520","DOIUrl":"https://doi.org/10.1109/ReConFig.2014.7032520","url":null,"abstract":"We present a field programmable gate array (FPGA) based implementation for direct torque control (DTC) of induction motor drives. The proposed design utilizes several improvements to execute the functional blocks in DTC that reduce the execution time and improve the sampling frequency. The FPGA system is implemented on a Xilinx Virtex-5 board using VHDL code assembled from scratch and the DSP based solution is implemented using dSPACE DS 1104. Both systems are validated experimentally with hardware-in-the-loop, on a small 200 W, 3-phase induction machine. Experimental results indicate that the proposed design enables a far higher sampling frequency (up to 800 kHz) than typical digital signal processor (DSP) based solutions, which are limited to 20 kHz. The higher sampling frequency helps mitigate torque ripple, which is a well-known limitation of DTC. Additionally, the short execution times suggest the possibility of extending the use of such FPGA implementations to serve auxiliary motor diagnostic functions.","PeriodicalId":137331,"journal":{"name":"2014 International Conference on ReConFigurable Computing and FPGAs (ReConFig14)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124295747","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Energy-efficient histogram on FPGA","authors":"Andrea Sanny, Y. Yang, V. Prasanna","doi":"10.1109/ReConFig.2014.7032517","DOIUrl":"https://doi.org/10.1109/ReConFig.2014.7032517","url":null,"abstract":"The construction of histograms is an integral part of image processing pipelines, useful for image editing features such as histogram matching, thresholding and histogram equalization. In the past, research on kernels used in image processing pipelines has targeted high throughput, area efficiency and low cost. However, a growing topic of interest that has not been fully explored is the use of energy efficiency as a key metric. In this work, we focus on developing an energy-efficient histogram implementation with a frame rate of at least 30 frames per second. We determine the components that consume the most power and propose an optimized histogram implementation that combines multiple optimizations to achieve a notable improvement in energy efficiency while maintaining suitable throughput for usage within image processing pipelines. These optimizations include a data-defined memory activation schedule, a careful data layout and circuit-level pipelining. Our architecture is implemented for commonly used image sizes, which vary from 240 × 128 to 1216 × 912, and assumes a pixel width of 16 bits. 
The post place-and-route results show that our optimized architecture has up to 15.3× higher energy efficiency when compared against the baseline architecture.","PeriodicalId":137331,"journal":{"name":"2014 International Conference on ReConFigurable Computing and FPGAs (ReConFig14)","volume":"61 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121288063","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"High-speed implementation of bcrypt password search using special-purpose hardware","authors":"Friedrich Wiemer, Ralf Zimmermann","doi":"10.1109/ReConFig.2014.7032529","DOIUrl":"https://doi.org/10.1109/ReConFig.2014.7032529","url":null,"abstract":"Using passwords for user authentication is still the most common method for many internet services and attacks on the password databases pose a severe threat. To reduce this risk, servers store password hashes, which were generated using special password-hashing functions, to slow down guessing attacks. The most frequently used functions of this type are PBKDF2, bcrypt and scrypt. In this paper, we present a novel, flexible, high-speed implementation of a bcrypt password search system on a low-power Xilinx Zynq 7020 FPGA. The design consists of 40 parallel bcrypt cores running at 100 MHz. Our implementation outperforms all currently available implementations and improves password attacks on the same platform by at least 42%, computing 6,511 passwords per second for a cost parameter of 5.","PeriodicalId":137331,"journal":{"name":"2014 International Conference on ReConFigurable Computing and FPGAs (ReConFig14)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128030837","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatic cache partitioning and time-triggered scheduling for real-time MPSoCs","authors":"Gang Chen, Biao Hu, Kai Huang, A. Knoll, Kai Huang, Di Liu, T. Stefanov","doi":"10.1109/ReConFig.2014.7032502","DOIUrl":"https://doi.org/10.1109/ReConFig.2014.7032502","url":null,"abstract":"Shared cache in modern multi-core systems is considered one of the major factors that degrade system predictability and performance. How to manage the shared cache for real-time multi-core systems in order to optimize the system performance while guaranteeing the system predictability is still an open issue. In this paper, we present a framework that can exploit cache management for real-time MPSoCs. The framework supports dynamic way-based cache partitioning at hardware level, building task-level time-triggered reconfigurable-cache MPSoCs. It automatically determines the time-triggered schedule and cache configuration for each task to improve the system performance while guaranteeing the real-time constraints. We evaluate the proposed framework with respect to different numbers of cores and cache modules and prototype the constructed MPSoCs on FPGA. 
Experimental results based on FPGA implementation demonstrate the effectiveness of the proposed framework over the state-of-the-art cache management strategies when tested with 27 benchmark programs on the constructed MPSoCs.","PeriodicalId":137331,"journal":{"name":"2014 International Conference on ReConFigurable Computing and FPGAs (ReConFig14)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132843568","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An FPGA-based all-digital 802.11b & 802.15.4 receiver for the Software Defined Radio paradigm","authors":"Alfredo Espinoza-Rhoton, L. F. Gonzalez-Perez, J. L. Ponce, B. Hector, Lennin C. Yllescas-Calderon, R. Parra-Michel, H. Aboushady","doi":"10.1109/ReConFig.2014.7032499","DOIUrl":"https://doi.org/10.1109/ReConFig.2014.7032499","url":null,"abstract":"An FPGA implementation of an all-digital fully compliant IEEE 802.11b and 802.15.4 configurable baseband receiver is presented. This architecture can be integrated in systems implementing the Software Defined Radio (SDR) paradigm, relaxing the need for high power consumption general purpose processors. The receiver uses a single architecture that can be configured for receiving either standard at run time, exploiting similarities between both protocols, and may serve as a coprocessor for offloading the task of processing baseband RF signals. The system can be used as a platform for future low power devices to integrate into the SDR paradigm. Results showed that the architecture exceeds the specifications required by both standards, and has great performance in low SNR scenarios, making it an attractive alternative in wireless sensor networks with extremely low signal power levels.","PeriodicalId":137331,"journal":{"name":"2014 International Conference on ReConFigurable Computing and FPGAs (ReConFig14)","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133580666","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Phenox: Zynq 7000 based quadcopter robot","authors":"Ryo Konomura, K. Hori","doi":"10.1109/ReConFig.2014.7032546","DOIUrl":"https://doi.org/10.1109/ReConFig.2014.7032546","url":null,"abstract":"We describe our design of hardware and software systems for a quadcopter robot, which we name \"Phenox\". Phenox is a palm-sized quadcopter robot that can fly fully autonomously without any external controller or supporting systems. In our previous studies, we proposed palm-sized and fully autonomous quadcopter robots. However, in the previous systems, almost all the capability of the CPU and FPGA was used only for autonomous flight, so it was hard to implement additional user applications on the robot. In the design of Phenox, we adopted the Zynq 7000 SoC, which has dual-core CPUs and an FPGA in one chip, enabling the users of the robot to implement their own application programs on the robot. We describe how Phenox processes images and sound, estimates its position, controls its flight, runs Linux OS and executes the user application programs in combination with open-source libraries such as OpenCV and Julius.","PeriodicalId":137331,"journal":{"name":"2014 International Conference on ReConFigurable Computing and FPGAs (ReConFig14)","volume":"266 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132125884","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"High-performance FPGA implementations of volterra DFEs for optical fiber systems","authors":"A. Emeretlis, G. Theodoridis, G. Glentis","doi":"10.1109/ReConFig.2014.7032528","DOIUrl":"https://doi.org/10.1109/ReConFig.2014.7032528","url":null,"abstract":"Low-complexity Volterra Decision Feedback Equalizers (VDFEs) for optical fiber links are proposed. By properly discarding large blocks of coefficients in the feedforward and feedback sections of the equalizer, a significant complexity reduction is achieved without affecting its efficiency. Moreover, suitable architectures and high-performance FPGA implementations are provided. It is demonstrated that the efficiency of the proposed VDFEs in terms of BER is similar to that of their full-size counterparts for links of the same length, while they demand 50% fewer arithmetic resources. Also, the proposed DFEs meet the desired 10 Gb/s rate and in certain cases achieve rates of 17 Gb/s and 25 Gb/s.","PeriodicalId":137331,"journal":{"name":"2014 International Conference on ReConFigurable Computing and FPGAs (ReConFig14)","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123783704","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Rotated parallel mapping: A novel approach for mapping data parallel applications on CGRAs","authors":"Simon Schulz, O. Bringmann, Thomas Schweizer, W. Rosenstiel","doi":"10.1109/ReConFig.2014.7032554","DOIUrl":"https://doi.org/10.1109/ReConFig.2014.7032554","url":null,"abstract":"In this paper we present a new way of mapping data-parallel applications on coarse-grained reconfigurable architectures (CGRAs) to increase their performance. Traditional mapping approaches aim to map an application to a minimum number of contexts. In this work we abandon this idea. We propose to use the temporal domain with multiple contexts as the preferred mapping domain. The benefit of this approach is that enough free resources are made accessible for a parallel execution of a datapath, which enables a higher utilization of a CGRA's resources, and thus a performance increase can be achieved. To show the validity of the proposed method, the speedup of various applications is evaluated using both theoretical and experimental studies. The results show a performance improvement of up to 122% when compared to traditional application mapping techniques.","PeriodicalId":137331,"journal":{"name":"2014 International Conference on ReConFigurable Computing and FPGAs (ReConFig14)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125325465","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hardware Task-Status Manager for an RTOS with FIFO communication","authors":"P. Zaykov, G. Kuzmanov, A. Molnos, K. Goossens","doi":"10.1109/ReConFig.2014.7032527","DOIUrl":"https://doi.org/10.1109/ReConFig.2014.7032527","url":null,"abstract":"In this paper, we address the problem of improving the performance of real-time embedded Multiprocessor System-on-Chip (MPSoC). Such MPSoCs often execute data-flow applications composed of multiple tasks, which communicate through First-In-First-Out (FIFO) queues. The tasks on each processor in the MPSoC are scheduled for execution by an instance of a Real-Time Operating System (RTOS). To improve performance, we propose a Hardware Task-Status Manager (HWTSM) block that reduces the Worst Case Execution Time (WCET) of the RTOS. The HWTSM is a Molen-style Custom Computing Unit (CCU), a coprocessor that determines the execution eligibility of tasks from FIFO-filling information. Furthermore, we propose a new processor-coprocessor execution model, denoted as parallel non-blocking. In this model the HWTSM execution overlaps with the execution of RTOS and user applications. The HWTSM is integrated into the existing CompSoC platform and this entire system is prototyped on a Xilinx XC5VFX130T FPGA chip. We experiment with two types of applications running in software, i.e., synthetic and real. With the synthetic applications, the results indicate a WCET reduction of the RTOS between 1.1 and 3.0 times. For each one of the real applications - JPEG and H.264 decoders, the experimental results indicate a WCET reduction of the RTOS by 1.3 and 1.6 times, respectively. 
The overall system performance gain varies from 0.9% to 13.3% for synthetic applications, from 2.3% to 4.6% for the JPEG decoder, and from 3.8% to 7.5% for the H.264 decoder.","PeriodicalId":137331,"journal":{"name":"2014 International Conference on ReConFigurable Computing and FPGAs (ReConFig14)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134016677","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}