2016 IEEE Computer Society Annual Symposium on VLSI (ISVLSI): Latest Papers, Page 6

MINLP Based Power Optimization for Pipelined ADC

2016 IEEE Computer Society Annual Symposium on VLSI (ISVLSI) Pub Date : 2016-07-01 DOI: 10.1109/ISVLSI.2016.64

A. Purushothaman

Citations: 9

The Impact of Heterogeneity on a Reconfigurable Multicore System

2016 IEEE Computer Society Annual Symposium on VLSI (ISVLSI) Pub Date : 2016-07-01 DOI: 10.1109/ISVLSI.2016.67

Rafael Fão de Moura, J. D. Souza, L. Carro, A. C. S. Beck, M. B. Rutzig

Abstract: Modern embedded systems must efficiently exploit parallelism at the thread and instruction levels to achieve the best performance with the lowest possible energy consumption. While Multiprocessor Systems-on-Chip (MPSoCs) are a commonly used solution, they do not provide an effective environment for software production, as each processing element implements a different Instruction Set Architecture (ISA). On the other hand, processors such as the ARM big.LITTLE comprise multiple cores with different organizations and the same ISA. However, such cores are power-hungry superscalar microarchitectures. Dynamic Reconfigurable Architectures (DRAs) emerge as a solution to fill this gap. By taking advantage of their regular fabric, it is possible to develop a low-energy heterogeneous system by coupling the cores to DRAs that have different processing capabilities and implement the same ISA. In this work, we evaluate such a system, varying both the size of the DRAs and the memory system involved. We show that tuning the latter yields energy savings of up to 36%, while a fully heterogeneous system saves 28% in energy at a 7% loss in performance compared to its homogeneous counterpart.

Citations: 1

Approximate Adder with Hybrid Prediction and Error Compensation Technique

2016 IEEE Computer Society Annual Symposium on VLSI (ISVLSI) Pub Date : 2016-07-01 DOI: 10.1109/ISVLSI.2016.16

Xinghua Yang, Yue Xing, F. Qiao, Qi Wei, Huazhong Yang

Abstract: This paper proposes an approximate adder that accelerates computation and reduces energy consumption for error-resilient applications at a moderate loss in output quality. The acceleration comes from the prediction scheme of the adder circuit, in which the critical path is divided into multiple short fragments so that the additions can proceed in parallel. The energy consumption is reduced by trimming the registers from the lower predictors of the design. Furthermore, a simple error compensation module is inserted into the approximate part of the circuit to decrease the relative error at very little hardware cost. Simulated in a 65nm CMOS process, the adder achieves a 2.82X speedup and a 57.8% energy-efficiency improvement over traditional adders. Compared with current high-performance approximate adders, the proposed adder shows 6.9% energy savings and a reduction in relative error of two orders of magnitude on random test data. Finally, the proposed approximate adder is adopted in DCT processing, where it achieves a PSNR increase of more than 10dB over current counterpart designs.

Citations: 11
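
The prediction idea in the approximate-adder abstract above (cutting the carry chain into short fragments that can be added in parallel) can be illustrated with a minimal Python sketch. It is not the authors' circuit: the function name `approx_add`, the 16-bit width, the 4-bit block size, and the rule of predicting each block's carry-in from the previous block alone are all assumptions chosen for illustration, and the error compensation module is omitted.

```python
def approx_add(a: int, b: int, width: int = 16, block: int = 4) -> int:
    """Add two unsigned integers block by block with a predicted carry-in.

    Hypothetical illustration only: each block predicts its carry-in from
    the previous block's operand bits, ignoring that block's own carry-in.
    The blocks are therefore independent and could be evaluated in parallel.
    The carry out of the top block is dropped.
    """
    mask = (1 << block) - 1
    result = 0
    for shift in range(0, width, block):
        a_blk = (a >> shift) & mask
        b_blk = (b >> shift) & mask
        if shift == 0:
            carry_in = 0  # the lowest block has no predecessor
        else:
            prev_a = (a >> (shift - block)) & mask
            prev_b = (b >> (shift - block)) & mask
            # Predicted carry: the previous block's carry-out, computed as if
            # its own carry-in were zero. This is where the approximation is.
            carry_in = (prev_a + prev_b) >> block
        result |= ((a_blk + b_blk + carry_in) & mask) << shift
    return result


def exact_add(a: int, b: int, width: int = 16) -> int:
    """Reference result: the exact sum truncated to `width` bits."""
    return (a + b) & ((1 << width) - 1)
```

Under these assumptions, `approx_add(0x000F, 0x0001)` equals the exact sum because the carry crosses only one block boundary, whereas `approx_add(0x00FF, 0x0001)` returns 0 instead of 0x0100 because the carry would have to ripple across two blocks. Errors of this kind are what an error compensation stage is meant to reduce.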

A Gracefully Degrading and Energy-Efficient Fault Tolerant NoC Using Spare Core

2016 IEEE Computer Society Annual Symposium on VLSI (ISVLSI) Pub Date : 2016-07-01 DOI: 10.1109/ISVLSI.2016.80

B. N. K. Reddy, M. H. Vasantha, Kumar Y. B. Nithin

Citations: 62

Mod (2P-1) Shuffle Memory-Access Instructions for FFTs on Vector SIMD DSPs

2016 IEEE Computer Society Annual Symposium on VLSI (ISVLSI) Pub Date : 2016-07-01 DOI: 10.1109/ISVLSI.2016.71

Sheng Liu, Haiyan Chen, Jianghua Wan, Yaohua Wang

Citations: 2

Dynamic Per-Warp Reconvergence Stack for Efficient Control Flow Handling in GPUs

2016 IEEE Computer Society Annual Symposium on VLSI (ISVLSI) Pub Date : 2016-07-01 DOI: 10.1109/ISVLSI.2016.35

Yaohua Wang, Xiaowen Chen, Dong Wang, Sheng Liu

Abstract: GPGPUs usually experience performance degradation when the control flow of threads diverges within a warp. Control flow handling based on a reconvergence stack is widely adopted in GPU architectures. The depth of such a stack is always set to a large number so that there are enough entries for warps experiencing nested branches. However, for warps experiencing simple branches or no branches at all, these deep reconvergence stacks stay idle, causing a serious waste of hardware resources. Moreover, as GPU architectures develop, more and more warps will be deployed on a GPU stream processor core, which could make the problem even more serious. To solve this problem, this paper proposes a dynamic reconvergence stack structure in which a stack pool is shared by all the warps, and dynamic stacks for different warps are constructed according to run-time requirements. This satisfies the stack requirement while eliminating unnecessary waste of hardware resources. Our experiments show that the dynamic reconvergence stack can reduce the cost of the stack by 50% while maintaining the performance of the conventional design.

Citations: 3
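
The shared stack pool described in the reconvergence-stack abstract above can be pictured with a small software model. This is a sketch under stated assumptions, not the paper's hardware design: the class name `StackPool`, the linked-entry layout, the free list, and the choice to report failure when the pool is exhausted are all hypothetical, and the entry payload (here a label and an active mask) is only a placeholder.

```python
class StackPool:
    """Toy model of one entry pool shared by the per-warp stacks.

    Hypothetical illustration: each warp's stack is a chain of pool entries,
    so a warp with no divergent branches holds no entries at all.
    """

    def __init__(self, num_entries: int):
        self.free = list(range(num_entries))   # indices of unused entries
        self.entries = [None] * num_entries    # (payload, index of entry below)
        self.top = {}                          # warp id -> index of its top entry

    def push(self, warp: int, payload) -> bool:
        """Take an entry from the pool for `warp`; return False if none is free."""
        if not self.free:
            # The abstract does not say how exhaustion is handled;
            # this sketch simply reports failure.
            return False
        idx = self.free.pop()
        self.entries[idx] = (payload, self.top.get(warp))
        self.top[warp] = idx
        return True

    def pop(self, warp: int):
        """Remove the top entry of `warp` and return the entry to the pool."""
        idx = self.top.get(warp)
        if idx is None:
            raise IndexError(f"warp {warp} has an empty stack")
        payload, below = self.entries[idx]
        self.entries[idx] = None
        self.free.append(idx)
        if below is None:
            del self.top[warp]
        else:
            self.top[warp] = below
        return payload

    def depth(self, warp: int) -> int:
        """Current stack depth of `warp`, found by walking its chain."""
        count, idx = 0, self.top.get(warp)
        while idx is not None:
            count += 1
            idx = self.entries[idx][1]
        return count
```

In this model, a pool of four entries can serve one warp that nests three branches deep and another that needs a single entry, which a fixed split of two entries per warp could not do.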

Design Optimization of Register File Throughput and Energy Using a Virtual Prototyping (ViPro) Tool

2016 IEEE Computer Society Annual Symposium on VLSI (ISVLSI) Pub Date : 2016-07-01 DOI: 10.1109/ISVLSI.2016.50

Ningxi Liu, B. Calhoun

Abstract: Register files (RFs) consume significant power in low-power processors, and their specifications vary substantially across applications. Challenges exist in identifying the appropriate RF design and optimizing RFs for different specifications. This paper not only explores methodologies for designing low-power, high-performance RFs but also extends a virtual prototyping (ViPro) tool to support fast and efficient estimation of the effect of different design knobs on the overall multi-port RF macros. To enable aggressive exploration of RF designs, three bitline (BL) sensing schemes are included in ViPro along with parasitic parameters extracted from layout. ViPro results are accurate to within 15% of full RF schematic SPICE simulation, while ViPro simulates 5-10 times faster. An example shows how ViPro can optimize RF design based on various specifications in a 45nm CMOS technology. With proper BL sensing techniques, data throughput for 1R/1W port RFs improves by 31% and 72% at 0.5KB and 512KB, respectively. Results also show that the optimal BL sensing scheme changes with memory capacity. At 0.5KB, the lowest energy per operation decreases by 7.5% with a single-ended BL, while the energy reduction is 45% with a hierarchical BL at 512KB.

Citations: 1

An Accurate All CMOS Temperature Sensor for IoT Applications

2016 IEEE Computer Society Annual Symposium on VLSI (ISVLSI) Pub Date : 2016-07-01 DOI: 10.1109/ISVLSI.2016.113

Sunil Kumar Maddikatla, S. Jandhyala

Abstract: This manuscript proposes an area-efficient, linear, robust CMOS integrated temperature sensor circuit in multiple technology nodes using the UMC RF process for IoT and low-cost SoC applications. In the UMC 180nm node, the proposed temperature sensor has an accuracy of ±0.4°C over 3σ variation in process and ±10% variation in supply, in the temperature range -55°C to 125°C. In the 65nm node, it has an accuracy of ±0.6°C over 3σ variation in process and ±10% variation in supply, in the temperature range -55°C to 125°C. The proposed design achieves a highly linear, proportional to absolute temperature (PTAT) voltage with reduced process corner dependence, using a process-invariant circuit in conjunction with a supply-independent biasing circuit. The supply sensitivity of the output voltage is 1100 ppm/V, and the spread with process is limited to ±0.6°C in UMC 180nm and ±1.5°C in 65nm technology. The proposed sensor in UMC 180nm technology occupies an area of 0.002 mm² and consumes 108μW of power. The output voltage is 136mV at room temperature (27°C) in the typical corner, with a slope of 0.650mV/°C. The temperature sensor is included in a micro gyroscope application, and the effect of temperature on the angular frequency at zero bias is presented.

Citations: 9
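
The two output figures quoted in the temperature-sensor abstract above (136mV at 27°C and a slope of 0.650mV/°C in the typical corner) are enough to sketch the voltage-to-temperature conversion a readout stage would perform. The sketch below assumes a perfectly linear response through that point, whereas the abstract only describes the output as highly linear; the constant and function names are illustrative and do not come from the paper.

```python
# Typical-corner figures quoted in the abstract.
V_REF_MV = 136.0         # output voltage at the reference temperature, in mV
T_REF_C = 27.0           # reference (room) temperature, in degrees Celsius
SLOPE_MV_PER_C = 0.650   # PTAT slope, in mV per degree Celsius


def output_mv(temp_c: float) -> float:
    """Ideal sensor output in mV at `temp_c`, assuming a perfectly linear response."""
    return V_REF_MV + SLOPE_MV_PER_C * (temp_c - T_REF_C)


def temperature_c(v_mv: float) -> float:
    """Invert the ideal linear model: temperature in degrees Celsius for an output of `v_mv`."""
    return T_REF_C + (v_mv - V_REF_MV) / SLOPE_MV_PER_C
```

Under this idealized model, the output spans roughly 82.7mV at -55°C to 199.7mV at 125°C. The quoted accuracy and supply-sensitivity figures are not modeled here.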

Workload-Aware Power Gating Design and Run-Time Management for Massively Parallel GPGPUs

2016 IEEE Computer Society Annual Symposium on VLSI (ISVLSI) Pub Date : 2016-07-01 DOI: 10.1109/ISVLSI.2016.60

K. Dev, S. Reda, Indrani Paul, Wei Huang, W. Burleson

Citations: 8

SoC, NoC and Hierarchical Bus Implementations of Applications on FPGAs Using the FCUDA Flow

2016 IEEE Computer Society Annual Symposium on VLSI (ISVLSI) Pub Date : 2016-07-01 DOI: 10.1109/ISVLSI.2016.131

T. Nguyen, Yao Chen, K. Rupnow, S. Gurumani, Deming Chen

Citations: 3