2013 8th International Workshop on Reconfigurable and Communication-Centric Systems-on-Chip (ReCoSoC): latest papers

D-RECS: A complete methodology to implement Self Dynamic Reconfigurable FPGA-based systems

2013 8th International Workshop on Reconfigurable and Communication-Centric Systems-on-Chip (ReCoSoC) Pub Date : 2013-07-10 DOI: 10.1109/ReCoSoC.2013.6581550

F. Cancare, C. Pilato, Andrea Cazzaniga, D. Sciuto, M. Santambrogio

Citations: 3

ACMA: Accuracy-configurable multiplier architecture for error-resilient System-on-Chip

2013 8th International Workshop on Reconfigurable and Communication-Centric Systems-on-Chip (ReCoSoC) Pub Date : 2013-07-10 DOI: 10.1109/ReCoSoC.2013.6581532

Kartikeya Bhardwaj, P. Mane

Abstract: In the nanometer regime, optimizing System-on-Chip (SoC) designs for speed, power, and area is a major concern for VLSI designers. Imprecise (approximate) design relaxes the constraints on accuracy, giving rise to a Speed-Power-Accuracy-Area (SPAA) metric that can lead to large improvements in speed and/or power in exchange for a small loss of accuracy. This benefit has drawn researchers toward approximate VLSI design. In this paper, we present a new accuracy-configurable multiplier architecture (ACMA) for error-resilient systems. ACMA uses a technique called Carry-in Prediction for approximate multiplication, based on efficient precomputation logic that increases its throughput. The proposed multiplier reduces the latency of an accurate multiplier by almost half by shortening its critical path. Simulation results suggest that the SPAA metrics can be tuned by running the design for an appropriate number of iterations. For 16-bit multiplication, the mean accuracy is 99.85% to 99.9% when there is no lower bound on operand size; when operands are 10 bits or more (numbers > 1000), the mean accuracy is 99.965%.

Citations: 33
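
The ACMA abstract reports mean accuracy figures with and without a lower bound on operand size. The sketch below shows one way such a figure could be measured. It rests on assumptions that are not from the paper: `approx_multiply` is a hypothetical stand-in that simply zeroes the low bits of each operand (it is not the Carry-in Prediction scheme), and accuracy is assumed to mean one minus the relative error.

```python
import random


def approx_multiply(a: int, b: int, dropped_bits: int = 4) -> int:
    """Hypothetical stand-in multiplier: zero the lowest bits of each operand.

    This is NOT the paper's Carry-in Prediction technique; it only gives
    the accuracy measurement below something approximate to evaluate.
    """
    mask = ~((1 << dropped_bits) - 1)
    return (a & mask) * (b & mask)


def accuracy(approx: int, exact: int) -> float:
    """Assumed definition: 1 - relative error (1.0 when both are zero)."""
    if exact == 0:
        return 1.0 if approx == 0 else 0.0
    return 1.0 - abs(approx - exact) / exact


def mean_accuracy(min_operand: int, samples: int = 20000, seed: int = 0) -> float:
    """Average accuracy over random 16-bit operand pairs >= min_operand."""
    rng = random.Random(seed)  # fixed seed keeps the run reproducible
    total = 0.0
    for _ in range(samples):
        a = rng.randint(min_operand, 0xFFFF)
        b = rng.randint(min_operand, 0xFFFF)
        total += accuracy(approx_multiply(a, b), a * b)
    return total / samples


if __name__ == "__main__":
    print("no lower bound :", mean_accuracy(1))
    print("operands > 1000:", mean_accuracy(1001))
```

With a fixed absolute error per operand, the relative error shrinks as operands grow, which is one plausible reason a lower bound on operand size raises the reported mean accuracy.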

CoEx: A novel profiling-based algorithm/architecture co-exploration for ASIP design

2013 8th International Workshop on Reconfigurable and Communication-Centric Systems-on-Chip (ReCoSoC) Pub Date : 2013-07-10 DOI: 10.1109/ReCoSoC.2013.6581520

Juan Fernando Eusse Giraldo, Christopher Williams, R. Leupers

Citations: 21

Approximation of hyperbolic tangent activation function using hybrid methods

2013 8th International Workshop on Reconfigurable and Communication-Centric Systems-on-Chip (ReCoSoC) Pub Date : 2013-07-10 DOI: 10.1109/ReCoSoC.2013.6581545

M. Sartin, A. M. Silva

Citations: 11

Memory allocation and optimization in system-level architectural synthesis

2013 8th International Workshop on Reconfigurable and Communication-Centric Systems-on-Chip (ReCoSoC) Pub Date : 2013-07-10 DOI: 10.1109/ReCoSoC.2013.6581537

Shuo Li, A. Hemani

Abstract: In this paper, we present a novel approach to optimally allocating memory resources in a system-level synthesis flow that converts a dataflow-style system description (synchronous data flow) into a register-transfer-level (RTL) description in the specified implementation style (ASIC, FPGA, or CGRA). The first problem encountered by the synthesis flow is that, because it covers different implementation styles, a generic model is required to support resource allocation and optimization. The second problem is how to optimally allocate memory resources in the RTL model. The contribution of this paper has two parts: 1) a generic memory model for the different memory architectures in ASIC, FPGA, and CGRA, and 2) a memory allocation and optimization method that maps storage elements in the intermediate representation to actual implementations (e.g., on-chip SRAM for ASIC; memory controller and off-chip SDRAM for FPGA). The memory allocation method is an implementation-style-dependent procedure with three steps: architecture-independent optimization, resource allocation, and architecture-dependent optimization. The experimental results show that the proposed method is efficient and effective: the automatically generated implementation uses only approximately 4% more resources than a manual implementation. The fast, automatic memory allocation method enables fast design space exploration that requires little effort from the system designer.

Citations: 3

Register allocation for high-level synthesis of hardware accelerators targeting FPGAs

2013 8th International Workshop on Reconfigurable and Communication-Centric Systems-on-Chip (ReCoSoC) Pub Date : 2013-07-10 DOI: 10.1109/ReCoSoC.2013.6581522

G. Hempel, Jan Hoyer, Thilo Pionteck, C. Hochberger

Abstract: This work evaluates the benefits of several register allocation strategies as part of a design flow for the automatic generation of application-specific hardware accelerators targeting FPGAs. Because vendor-specific design tools are mandatory for system designs targeting FPGAs, high-level synthesis has to account for the optimization capabilities already implemented in these tools. FPGA-specific hardware characteristics have to be considered as well. Several register allocation strategies are therefore evaluated in the context of a GCC-based C-to-HDL design flow for application-specific hardware accelerators. The evaluation uses several example designs from typical embedded-system application domains. These designs were synthesized with the ISE design suite, using either area or speed as the optimization criterion. Synthesis results for Spartan 6 and Artix 7 FPGAs show that, with regard to clock frequency and area requirements, the register allocation strategy should be kept simple when generating HDL code as input for FPGA vendor-specific design tools.

Citations: 3

Improving parallel MPSoC simulation performance by exploiting dynamic routing delay prediction

2013 8th International Workshop on Reconfigurable and Communication-Centric Systems-on-Chip (ReCoSoC) Pub Date : 2013-07-10 DOI: 10.1109/ReCoSoC.2013.6581524

Christoph Roth, Harald Bucher, Simon Reder, O. Sander, J. Becker

Abstract: Raising the abstraction level and parallel execution are two possible ways to cope with the extremely long runtimes of complex Multi-Processor System-on-Chip (MPSoC) simulations. In previous work, a SystemC/TLM-based modeling methodology targeting accurate simulation of NoC-based MPSoCs has been proposed that benefits from both. Communication is abstracted into transactions. This enables extraction of parallelism through temporal decoupling, which increases the efficiency of parallel simulation if a loss of accuracy is acceptable. This work extends the previous work with a dynamic prediction mechanism that adapts the degree of temporal decoupling at runtime and thus prevents any loss of accuracy. The method is based on local time quanta, one for every module connection. Delay annotations within modules are exploited to predict communication delays between modules. Based on these predictions, the local time quanta are dynamically adjusted. The approach is evaluated on a realistic MPSoC model, with measurements performed on different host platforms. Results demonstrate that the method can contribute significantly to the acceleration of both parallel and sequential simulation.

Citations: 1
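
The idea of per-connection time quanta that follow predicted communication delays can be illustrated with a small model. This is a minimal sketch under stated assumptions, not the authors' SystemC/TLM implementation: `predict_quantum_ns` and `count_syncs` are hypothetical names, and the predictor simply sums the delay annotations along a path, on the assumption that no transaction can arrive earlier than that.

```python
from typing import Sequence


def predict_quantum_ns(annotated_delays_ns: Sequence[int]) -> int:
    """Hypothetical predictor: the quantum for one connection is the sum of
    the delay annotations a transaction must cross before it can arrive.

    Running ahead by at most this amount cannot miss an incoming
    transaction in this simplified model. The floor of 1 ns avoids a
    zero-length quantum.
    """
    return max(1, sum(annotated_delays_ns))


def count_syncs(step_ns: int, end_ns: int, quantum_ns: int) -> int:
    """Count how often a module that advances in fixed steps must
    synchronize with its peer when allowed to run ahead by quantum_ns."""
    syncs = 0
    last_sync_ns = 0
    now_ns = 0
    while now_ns < end_ns:
        now_ns += step_ns
        if now_ns - last_sync_ns >= quantum_ns:
            syncs += 1
            last_sync_ns = now_ns
    return syncs


if __name__ == "__main__":
    fixed = count_syncs(step_ns=10, end_ns=1000, quantum_ns=10)
    adapted = count_syncs(step_ns=10, end_ns=1000,
                          quantum_ns=predict_quantum_ns([20, 30]))
    print(fixed, adapted)  # prints "100 20"
```

In this toy model a larger predicted delay yields a larger quantum and fewer synchronization points, which is the kind of saving the abstract attributes to adjusting the quanta at runtime.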

RecMIN: A reconfiguration architecture for network on chip

2013 8th International Workshop on Reconfigurable and Communication-Centric Systems-on-Chip (ReCoSoC) Pub Date : 2013-07-10 DOI: 10.1109/ReCoSoC.2013.6581547

A. Logvinenko, Carsten Gremzow, D. Tutsch

Citations: 5

Dynamically reconfigurable FIR filter architectures with fast reconfiguration

2013 8th International Workshop on Reconfigurable and Communication-Centric Systems-on-Chip (ReCoSoC) Pub Date : 2013-07-10 DOI: 10.1109/ReCoSoC.2013.6581517

M. Kumm, Konrad Möller, P. Zipf

Abstract: This work compares two finite impulse response (FIR) filter architectures for FPGAs whose coefficients can be reconfigured at run time. One is a recently proposed filter architecture based on distributed arithmetic (DA); the other is based on a LUT multiplication scheme. The common internal configuration access port (ICAP) can change both logic and routing, but for the architectures considered here it is sufficient to reconfigure only the logic. This is done with the Xilinx configurable look-up table (CFGLUT) primitive, which allows reconfiguration times that are orders of magnitude shorter than with ICAP. The resulting FIR filter architectures achieve reconfiguration times of typically less than 100 ns. They can be reconfigured with arbitrary coefficients, limited only by the filter length and word size. Because their resource consumption depends on different filter parameters, a detailed comparison is made. It turns out that if the input word size is greater than approximately half the number of coefficients, the LUT-based multiplication scheme needs fewer resources than the DA architecture, and vice versa.

Citations: 43
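
For readers unfamiliar with distributed arithmetic, the following is a minimal textbook-style sketch of a DA inner product, not the architecture from the paper. It assumes unsigned input samples and a single look-up table addressed by one bit from every sample; `build_da_lut` and `da_dot` are hypothetical names. Changing the coefficients only means rewriting the table contents, which is loosely analogous to rewriting LUT contents in hardware.

```python
from typing import List, Sequence


def build_da_lut(coeffs: Sequence[int]) -> List[int]:
    """Precompute every partial sum of the coefficients.

    Entry `addr` holds the sum of coeffs[k] for each bit k set in addr,
    so the table has 2**len(coeffs) entries.
    """
    n = len(coeffs)
    return [
        sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1)
        for addr in range(1 << n)
    ]


def da_dot(lut: Sequence[int], samples: Sequence[int], word_bits: int) -> int:
    """Compute sum(coeffs[k] * samples[k]) without any multiplier.

    Assumes unsigned samples that fit in word_bits bits. For each bit
    position, the bits of all samples form a table address; the table
    value is shifted by the bit weight and accumulated.
    """
    acc = 0
    for bit in range(word_bits):
        addr = 0
        for k, x in enumerate(samples):
            addr |= ((x >> bit) & 1) << k
        acc += lut[addr] << bit
    return acc


if __name__ == "__main__":
    coeffs = [3, -1, 4, 2]
    samples = [5, 7, 0, 9]
    lut = build_da_lut(coeffs)
    print(da_dot(lut, samples, word_bits=4))  # prints 26

    # "Reconfiguration": new coefficients only require new table contents.
    lut = build_da_lut([1, 1, 1, 1])
    print(da_dot(lut, samples, word_bits=4))  # prints 21
```

The table size grows with the number of coefficients while the loop count grows with the input word size, which gives some intuition for why the abstract's resource comparison depends on both parameters.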

An exploration of heterogeneous systems

2013 8th International Workshop on Reconfigurable and Communication-Centric Systems-on-Chip (ReCoSoC) Pub Date : 2013-07-10 DOI: 10.1109/ReCoSoC.2013.6581542

Jesús Carabaño, Francisco Dios, M. Daneshtalab, M. Ebrahimi

Abstract: Heterogeneous computing is an increasingly popular way to achieve further scalability in high-performance computing. It joins different processing units in a network-based system so that each task is preferably executed by the unit that can perform it most efficiently. Memory hierarchy, instruction set, control logic, and other properties may differ between processing units so that each is specialized for a different kind of problem. However, these systems take computer engineers more time to understand, design, and program. On the other hand, suitable problems running on well-chosen heterogeneous systems achieve higher performance and better energy efficiency. Because of this trade-off, heterogeneous systems are seldom useful outside embedded computing and high-performance computing. Of the two, embedded computing is more area- and energy-efficient, while high-performance computing delivers more performance. GPUs, FPGAs, and the new Xeon Phi are examples of common computational units that, along with CPUs, can compose heterogeneous systems aimed at accelerating program execution. In this paper, we explore these architectures in terms of energy efficiency, performance, and productivity.

Citations: 10