2015 28th International Conference on VLSI Design: Latest Papers

Bandwidth Adaptive Nanophotonic Crossbars with Clockwise/Counter-clockwise Optical Routing
2015 28th International Conference on VLSI Design Pub Date : 2015-01-01 DOI: 10.1109/VLSID.2015.26
M. Kennedy, Avinash Karanth Kodi
Abstract: Future processors are anticipated to have hundreds or even thousands of processing cores placed on a single silicon chip. The increasing number of cores on a single chip presents new challenges, pushing researchers to explore opportunities in emerging technologies such as on-chip silicon nanophotonics. The implications of nanophotonic technology have created a unique landscape for new interconnect designs. Among the many architectures made possible by nanophotonics, there has been notable interest in crossbar topologies that were previously impractical using only electrical components. In this paper, we present a new nanophotonic crossbar interconnect architecture that aims to retain the low-latency, single-hop characteristic of the crossbar topology while also improving the network's utilization of the static laser source, which is often wasted on insertion losses and unused bandwidth. We compare our architecture to other proposed architectures in terms of area, power consumption, throughput, and latency. On synthetic traffic patterns, throughput improves by approximately 13% compared to other optical crossbar topologies and by 92% compared to a conventional electrical flattened butterfly topology.
Citations: 5
EvoDeb: Debugging Evolving Hardware Designs
2015 28th International Conference on VLSI Design Pub Date : 2015-01-01 DOI: 10.1109/VLSID.2015.87
Debjyoti Bhattacharjee, A. Banerjee, A. Chattopadhyay
Abstract: Increasing design complexity and skyrocketing fabrication costs for modern digital systems, coupled with an unacceptably large number of silicon respins, have led to the growing importance of comprehensive and automated design verification. Akin to software configuration management, it is becoming commonplace to maintain large hardware design code bases with hardware configuration management tools. A crucial missing piece in this approach is managing design verification across evolving hardware designs. In this paper, we propose an efficient methodology for automatically localizing design errors across design versions. The proposed technique, EvoDeb, can be easily integrated into a hardware configuration management framework and scales to large designs. We demonstrate the efficacy of EvoDeb on a couple of bugs in open-source hardware designs across multiple evolving variants.
Citations: 5
Micro-architectural Enhancements in Distributed Memory CGRAs for LU and QR Factorizations
2015 28th International Conference on VLSI Design Pub Date : 2015-01-01 DOI: 10.1109/VLSID.2015.31
Farhad Merchant, Arka Maity, Mahesh Mahadurkar, Kapil Vatwani, Ishan Munje, C. MadhavaKrishna, N. Sivanandan, N. Gopalan, S. Raha, S. Nandy, R. Narayan
Abstract: LU and QR factorizations are the computationally expensive part of many applications, ranging from large-scale simulations (e.g., computational fluid dynamics) to augmented reality. These factorizations exhibit a time complexity of O(n^3) and are difficult to accelerate because they combine bandwidth-bound kernels, BLAS-1 or BLAS-2 (level-1 or level-2 Basic Linear Algebra Subprograms), with compute-bound kernels (BLAS-3, level-3 BLAS). On the other hand, Coarse Grained Reconfigurable Architectures (CGRAs) have gained tremendous popularity as accelerators in embedded systems due to their flexibility and ease of use. Provisioning these accelerators in High Performance Computing (HPC) platforms is a research challenge that computer scientists are wrestling with. We consider a CGRA environment in which several Compute Elements (CEs) enhanced with Custom Functional Units (CFUs) are interconnected over a Network-on-Chip (NoC). In this paper, we carry out extensive micro-architectural exploration for accelerating core kernels such as Matrix Multiplication (MM) (BLAS-3) for LU and QR factorizations. Our five design enhancements reduce the latency of BLAS-3 kernels. On a stand-alone CFU, we achieve up to 8x speed-up for MM. A commensurate improvement is observed for MM in a CGRA environment. We achieve better GFLOPS/mm² compared to recent implementations.
Citations: 14
Implementation of NOR Logic Based on Material Implication on CMOL FPGA Architecture
2015 28th International Conference on VLSI Design Pub Date : 2015-01-01 DOI: 10.1109/VLSID.2015.94
P. Mane, Nishil Talati, Ameya Riswadkar, Bhavan Jasani, C. K. Ramesha
Abstract: A memristor-based nanocrossbar layer fabricated on a CMOS layer has shown tremendous potential as high-density memory and in reconfigurable logic architectures. Instead of having predesigned Configurable Logic Blocks (CLBs) and memory for reconfiguration as in an FPGA, they can be instantiated in nanocrossbar memory as the need arises. In this paper, we present a novel design of a NOR block as the basic unit of computation for in-memory calculations, to be implemented on the CMOL FPGA architecture. This block implements its function using material implication. The proposed scheme stands in contrast to the naturally arising Boolean-logic-based NOR block in CMOL FPGA.
Citations: 4
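To make the idea in the entry above concrete, the following is a minimal sketch of how a NOR function can be composed from material implication (IMPLY) and a constant 0 alone. It is a purely logical illustration, not the paper's circuit: the function names `imply` and `nor_via_imply` are hypothetical, and nothing here models memristor states or the CMOL fabric.

```python
def imply(p: int, q: int) -> int:
    """Material implication p -> q, i.e. (NOT p) OR q, on 0/1 values."""
    return (1 - p) | q


def nor_via_imply(a: int, b: int) -> int:
    """NOR built only from IMPLY and the constant 0 (illustrative names)."""
    not_a = imply(a, 0)        # a -> 0 is NOT a
    a_or_b = imply(not_a, b)   # (NOT a) -> b is a OR b
    return imply(a_or_b, 0)    # (a OR b) -> 0 is NOR(a, b)


# Enumerate all four input combinations.
truth_table = {(a, b): nor_via_imply(a, b) for a in (0, 1) for b in (0, 1)}
print(truth_table)
```

The sketch uses three IMPLY steps; the number of steps and devices in an actual memristive implementation depends on the design and is not stated in the abstract.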
A High-Performance Energy-Efficient Hybrid Redundant MAC for Error-Resilient Applications
2015 28th International Conference on VLSI Design Pub Date : 2015-01-01 DOI: 10.1109/VLSID.2015.65
Sunil Dutt, Anshu Chauhan, Rahul Bhadoriya, Sukumar Nandi, G. Trivedi
Abstract: In the majority of Digital Signal Processing (DSP) applications, such as image, audio and video processing, the final result is interpreted by human senses, and the limited perception of human senses relaxes the strict requirement on accuracy. Thus, by adopting the emerging concept of approximate computing, we propose an approximate radix-2 hybrid redundant Multiply-and-Accumulate (Approx MAC) unit, which gives rise to a novel Speed-Power-Accuracy-Area (SPAA) metric. The Approx MAC unit attains tremendous improvements in computational performance, energy efficiency and silicon area with a trivial degradation in output quality. To examine the effectiveness of the proposed approach in real-time DSP applications, we demonstrate a JPEG-E-X IP core architecture with an embedded Approx MAC unit. The Approx MAC unit with 40 approximate LSBs achieves 7.177x and 1.526x speedup, 1.594x and 4.163x energy efficiency, and 1.131x and 1.277x silicon area improvements over binary and hybrid redundant MAC units, respectively. Moreover, the Approx MAC unit with 40 approximate LSBs improves the power-precision and delay-precision metrics by 14.71% and 32.95%, respectively.
Citations: 6
Way Halted Prediction Cache: An Energy Efficient Cache Architecture for Embedded Processors
2015 28th International Conference on VLSI Design Pub Date : 2015-01-01 DOI: 10.1109/VLSID.2015.16
Neethu Bal Mallya, Geeta Patil, B. Raveendran
Abstract: This paper proposes a novel cache architecture, Way Halted Prediction, to reduce the energy consumption and effective access time of set-associative caches. This is achieved with the help of a halt tag array and a prediction circuit. Experimental evaluation of various SPEC benchmark programs on the CACTI 5.3 and CASIM simulators reveals that the proposed architecture offers 33%, 6% and 3% savings in dynamic energy consumption and 1.80%, 6.13% and -1.95% savings in effective access time over conventional, way-predicting and way-halting cache architectures, respectively.
Citations: 6
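The abstract above does not describe how the halt tag array and the prediction circuit interact, so the following is only a rough sketch under stated assumptions: the halt tag is taken to be the low-order `HALT_BITS` bits of each way's tag, ways whose halt tag differs from the request are never accessed, and the predicted way is probed first when it survives that filter. All names (`HALT_BITS`, `halt_tag`, `ways_probed`) are hypothetical and the policy is not claimed to be the paper's.

```python
from typing import List, Optional, Tuple

HALT_BITS = 4  # assumed halt-tag width (low-order tag bits); not from the paper


def halt_tag(tag: int) -> int:
    """Low-order bits of the tag, as kept in the (assumed) halt tag array."""
    return tag & ((1 << HALT_BITS) - 1)


def ways_probed(set_tags: List[Optional[int]], requested_tag: int,
                predicted_way: int) -> Tuple[Optional[int], List[int]]:
    """Return (hit way or None, ways whose full tag/data arrays were accessed)."""
    # Step 1: halt every way whose halt tag cannot match the request.
    candidates = [w for w, t in enumerate(set_tags)
                  if t is not None and halt_tag(t) == halt_tag(requested_tag)]
    probed: List[int] = []
    # Step 2: probe the predicted way first, if it was not halted.
    if predicted_way in candidates:
        probed.append(predicted_way)
        if set_tags[predicted_way] == requested_tag:
            return predicted_way, probed
    # Step 3: on a misprediction, probe the remaining non-halted ways.
    probed.extend(w for w in candidates if w != predicted_way)
    hit = next((w for w in probed if set_tags[w] == requested_tag), None)
    return hit, probed


tags_in_set = [0x12, 0x34, 0x52, 0x77]  # one 4-way set, illustrative tags
print(ways_probed(tags_in_set, 0x52, predicted_way=2))
print(ways_probed(tags_in_set, 0x52, predicted_way=0))
```

Counting the entries in the second tuple element gives a crude proxy for dynamic energy: fewer probed ways means fewer tag and data array activations.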
Using Boolean Tests to Improve Detection of Transistor Stuck-Open Faults in CMOS Digital Logic Circuits
2015 28th International Conference on VLSI Design Pub Date : 2015-01-01 DOI: 10.1109/VLSID.2015.73
X. Lin, S. Reddy, J. Rajski
Abstract: Currently, transistor stuck-open (TSOP) faults in CMOS digital logic circuits are detected by two-pattern tests consisting of an initialization pattern to set the output of a faulty gate, followed by a pattern that detects a stuck-at fault. Some TSOP faults may not be detected by such two-pattern tests. One reason for this is that appropriate initialization patterns cannot be obtained using Boolean (steady-state) analysis of the circuit. For some of these faults, the required initialization may be possible using hazards (glitches) [10][13]. However, ensuring that a test using hazard-based initialization actually detects the target fault requires accurate transient analysis of the circuit under test, for example with SPICE. In this work, we propose methods to augment test generation procedures to detect TSOP faults using traditional steady-state Boolean analysis (called Boolean tests in this work). We also investigate why test patterns do not exist for the faults not detected in benchmark circuits. In many such cases, we found that the non-existence of test patterns is due to redundant gates that can be replaced by a constant 1 or 0. We present results on larger ISCAS-89 benchmark circuits to illustrate the effectiveness of the proposed methods in generating tests to detect TSOP faults, along with the results of the analysis of why no tests exist for the remaining faults undetected by Boolean tests.
Citations: 7
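The two-pattern test mentioned in the entry above can be illustrated with a small behavioral model. The sketch below assumes a 2-input CMOS NAND gate whose output node keeps its previous value when neither the pull-up nor the pull-down network conducts. The function names and the choice of fault (the pMOS driven by input `a` stuck open) are illustrative and are not taken from the paper.

```python
def nand_output(a: int, b: int, prev: int, pmos_a_open: bool = False) -> int:
    """Steady-state output of a 2-input CMOS NAND with an optional stuck-open pMOS."""
    # Pull-up: two parallel pMOS devices, each conducting when its input is 0.
    pull_up = ((a == 0) and not pmos_a_open) or (b == 0)
    # Pull-down: two series nMOS devices, conducting only when both inputs are 1.
    pull_down = (a == 1) and (b == 1)
    if pull_up:
        return 1
    if pull_down:
        return 0
    return prev  # floating node: the output retains its previous value


def apply_sequence(patterns, pmos_a_open: bool, initial: int = 1) -> int:
    """Apply input patterns in order and return the final output value."""
    out = initial
    for a, b in patterns:
        out = nand_output(a, b, out, pmos_a_open)
    return out


two_pattern = [(1, 1), (0, 1)]  # initialize the output to 0, then try to pull it up
good = apply_sequence(two_pattern, pmos_a_open=False)
faulty = apply_sequence(two_pattern, pmos_a_open=True)
single_good = apply_sequence([(0, 1)], pmos_a_open=False)
single_faulty = apply_sequence([(0, 1)], pmos_a_open=True)
print(good, faulty, single_good, single_faulty)
```

With the initialization pattern, the fault-free and faulty gates end at different values, so the fault is observable. With the test pattern alone and an output that already holds 1, both gates show 1 and the fault goes unnoticed, which is why the initialization pattern matters.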
FPGA Based Scalable Fixed Point QRD Core Using Dynamic Partial Reconfiguration
2015 28th International Conference on VLSI Design Pub Date : 2015-01-01 DOI: 10.1109/VLSID.2015.64
G. Prabhu, Bibin Johnson, J. S. Rani
Abstract: This work presents an FPGA-based scalable fixed-point QRD architecture based on the Givens rotation algorithm. The proposed QRD core uses an efficient pipelined and unfolded 2D MAC-based systolic array architecture with dynamic partial reconfiguration (DPR) capability. An improved LUT-based Newton-Raphson method is proposed for finding the square root and inverse square root, which helps reduce the area by 71% and the latency by 50% while operating at a frequency 49% higher than existing boundary cell architectures. The scalability of the QRD core is achieved using DPR, which reduces dynamic power and area utilization compared to a static implementation. The proposed architecture is implemented on a Xilinx Virtex-6 FPGA for any real matrix of size m × n, where 4 ≤ n ≤ 8 and m ≥ n, by dynamically inserting or removing the partial modules. The evaluation results show reductions in latency, area and power compared to CORDIC-based architectures. The proposed scalable QRD core is used to implement a high-performance adaptive equalizer (QRD-RLS algorithm) used in mobile receivers, and the evaluation is done by transmitting BPSK symbols in the training mode.
Citations: 10
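As a rough illustration of the LUT-seeded Newton-Raphson idea named in the entry above, the sketch below computes an inverse square root by reading an initial guess from a small table and refining it with the standard update y ← y(1.5 − 0.5·x·y²). It uses floating point rather than fixed point, assumes the input has already been normalized to the range [1, 4), and picks a 16-entry table arbitrarily. None of these choices, nor the names `SEED_LUT`, `inv_sqrt`, and `sqrt_from_inv`, come from the paper.

```python
import math

LUT_BITS = 4                      # assumed table size: 2**4 = 16 entries
LO, HI = 1.0, 4.0                 # assumed normalized input range [LO, HI)
STEP = (HI - LO) / (1 << LUT_BITS)
# Seed each table entry with 1/sqrt at the midpoint of its sub-interval.
SEED_LUT = [1.0 / math.sqrt(LO + (i + 0.5) * STEP) for i in range(1 << LUT_BITS)]


def inv_sqrt(x: float, iterations: int = 2) -> float:
    """Approximate 1/sqrt(x) for LO <= x < HI using a LUT seed plus Newton-Raphson."""
    if not (LO <= x < HI):
        raise ValueError("x must be normalized to [1, 4)")
    y = SEED_LUT[int((x - LO) / STEP)]   # initial guess from the table
    for _ in range(iterations):
        y = y * (1.5 - 0.5 * x * y * y)  # Newton-Raphson step for f(y) = 1/y**2 - x
    return y


def sqrt_from_inv(x: float, iterations: int = 2) -> float:
    """sqrt(x) obtained as x * (1/sqrt(x)), avoiding a divider."""
    return x * inv_sqrt(x, iterations)


print(inv_sqrt(2.0), sqrt_from_inv(2.0))
```

Because the iteration converges quadratically, a better seed (a larger table) trades memory for fewer iterations. That is the kind of area and latency trade-off a hardware design would tune.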
A Novel Ternary Content-Addressable Memory (TCAM) Design Using Reversible Logic
2015 28th International Conference on VLSI Design Pub Date : 2015-01-01 DOI: 10.1109/VLSID.2015.99
S. D. Kumar, S. Mahammad
Abstract: Content-addressable memory (CAM) is a special type of memory that can perform a search operation in a single clock cycle. CAM has the disadvantage of high power dissipation during the matching operation. Ternary content-addressable memory (TCAM) is a special type of memory that can search for logic 0, logic 1, and logic 'x' (don't care). These memories are used in routers to perform the lookup-table function in a single clock cycle. As the use of networks, typified by the Internet, has spread widely in recent years, attention has focused on TCAMs as a key device for increasing the speed of packet forwarding by networking equipment, since they enable high-speed lookup of destinations and similar fields across large volumes of information during packet data transfers. Reversible logic has gained interest in recent years due to its ultra-low-power characteristics. Much work has been done to reduce power consumption in TCAMs. This paper presents a novel design of TCAM cells using reversible logic. The proposed design is optimized in terms of the number of garbage outputs and quantum cost. The proposed TCAM cell performs the same function as the conventional TCAM cell.
Citations: 2
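The ternary match described in the entry above (logic 0, logic 1, and 'x') can be sketched in software as follows. This only models the lookup behavior: a real TCAM compares all entries in parallel in hardware, while this sketch loops over them. It says nothing about the reversible-logic cell design, and the function names and the lowest-index-wins priority rule are assumptions for illustration.

```python
from typing import List, Optional


def ternary_match(key: str, pattern: str) -> bool:
    """True if every pattern symbol is 'x' (don't care) or equals the key bit."""
    if len(key) != len(pattern):
        raise ValueError("key and pattern must have the same width")
    return all(p == "x" or p == k for k, p in zip(key, pattern))


def tcam_lookup(key: str, entries: List[str]) -> Optional[int]:
    """Index of the first matching entry (lowest index has priority), or None."""
    for index, pattern in enumerate(entries):
        if ternary_match(key, pattern):
            return index
    return None


# Entries ordered from most to least specific, as in a longest-prefix-match table.
table = ["1010xxxx", "10xxxxxx", "xxxxxxxx"]
print(tcam_lookup("10101111", table))
print(tcam_lookup("10011111", table))
print(tcam_lookup("01000000", table))
```

Ordering entries from most to least specific is what lets a priority encoder turn the first match into a longest-prefix match, the typical router use case mentioned in the abstract.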
Framework for Selective Flip-Flop Replacement for Soft Error Mitigation
2015 28th International Conference on VLSI Design Pub Date : 2015-01-01 DOI: 10.1109/VLSID.2015.70
Pavan Vithal Torvi, V. Devanathan, V. Kamakoti
Abstract: With the increasing adoption of newer technologies and architectures targeted at automotive and aviation electronics to improve performance and/or reduce power and area, soft-error robustness is becoming an important issue in ensuring reliable operation for an extended lifetime over a wide range of operating conditions. In this paper, we propose a modeling and optimization framework to systematically improve the FIT (failure-in-time) rate of a design with minimal impact on power, performance and area. We first propose a framework to model and evaluate the relative vulnerability to soft errors of the standard master-slave flip-flops and Dual Interlocked Storage Cells (DICE) in the cell library. We then formulate a linear optimization problem using this information to selectively replace flip-flops so as to improve the FIT rate of the design with minimal impact on area and power. Applying the proposed technique to a popular industrial IP core shows a 32% relative improvement in design robustness with just a 2% increase in design area.
Citations: 5