Debdeep Mukhopadhyay, R. Chakraborty, Phuong Ha Nguyen, D. Sahoo
{"title":"Tutorial T7: Physically Unclonable Function: A Promising Security Primitive for Internet of Things","authors":"Debdeep Mukhopadhyay, R. Chakraborty, Phuong Ha Nguyen, D. Sahoo","doi":"10.1109/VLSID.2015.115","DOIUrl":"https://doi.org/10.1109/VLSID.2015.115","url":null,"abstract":"Summary form only given. Internet of Things (IoT) is a network of large number of uniquely identifiable intercommunicating “smart” devices that promise to transform our lives. Lightweight authentication protocols for resource-constrained smart devices should be secure against “physical attacks” such as Side-Channel Attack. Physically Unclonable Functions (PUFs) are a class of novel hardware security primitives that promise a paradigm shift in many security applications and protocols. In essence, a PUF circuit is a partially disordered system that has an instance-specific input-output behavior that cannot be replicated by manufacturing (hence \"physically unclonable\"). The unique features of PUFs avoid explicit key storage, and thus make them immune against many of the existing physical attacks which aim to divulge the secret key of cryptographic algorithms. However, the concept of PUF is not a panacea in the domain of security, and they are still vulnerable to several forms of intelligent attacks, using a combination of concepts borrowed from side-channel analysis and machine learning. In this tutorial, we would explore design challenges, operating principles, attacks and defence strategies for PUF circuits. 
The tutorial would cover the following topics: Fundamentals of PUF, Lightweight PUF Designs, Security Analysis of PUFs and PUF-based Authentication Protocols.","PeriodicalId":123635,"journal":{"name":"2015 28th International Conference on VLSI Design","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125800928","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Arun Joseph, A. Haridass, C. Lefurgy, Spandana Rachamalla, Sreekanth Pai, Diyanesh Chinnakkonda, Vidushi Goyal
{"title":"FirmLeak: A Framework for Efficient and Accurate Runtime Estimation of Leakage Power by Firmware","authors":"Arun Joseph, A. Haridass, C. Lefurgy, Spandana Rachamalla, Sreekanth Pai, Diyanesh Chinnakkonda, Vidushi Goyal","doi":"10.1109/VLSID.2015.84","DOIUrl":"https://doi.org/10.1109/VLSID.2015.84","url":null,"abstract":"Separating the dynamic power and leakage power components from total microprocessor power can enable new optimizations for cloud computing. To this end, we introduce FirmLeak, a new framework that enables accurate, real-time estimation of microprocessor leakage power by system software. FirmLeak accounts for power-gating regions, per-core voltage domains, and manufacturing variation. We present an experimental evaluation of FirmLeak on a POWER7+ microprocessor for a range of hardware parts, voltages and temperatures. We discuss how this can be used in two applications to manage power by 1) improving billing of energy for cloud computing and 2) optimizing fan power consumption.","PeriodicalId":123635,"journal":{"name":"2015 28th International Conference on VLSI Design","volume":"103 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130464534","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cross-Layer Exploration of Heterogeneous Multicore Processor Configurations","authors":"S. Sarma, N. Dutt","doi":"10.1109/VLSID.2015.30","DOIUrl":"https://doi.org/10.1109/VLSID.2015.30","url":null,"abstract":"Heterogeneous multicore processors (HMP) present significant advantages over homogenous multiprocessors due to their improved power, performance, and energy efficiency for a given chip/die area. However, due to their diverse and vast design space, selecting a suitable HMP configuration with different core types within a given area-power budget is an extremely challenging task. In this paper, we present a cross-layer approach for exploring and configuring a HMP for a given system goal under system level constraints (such as equal area or power budget) as an optimization problem. Unlike the state-of-the-art approaches, we jointly consider cross-layer features of the application, operating system (task allocation strategies), and hardware architecture while deploying computationally efficient predictive models (of performance and power) in configuring the HMP platform resources (number and types of cores) in an evolutionary optimization framework. 
Our predictive cross-layer approach enables the designer to comparatively evaluate and select the most promising (e.g., energy- and performance-efficient) HMP configuration with over two orders of magnitude less simulation time, especially during the early design and verification stages when the design space is at its largest.","PeriodicalId":123635,"journal":{"name":"2015 28th International Conference on VLSI Design","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131802342","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Scaling the UVM_REG Model towards Automation and Simplicity of Use","authors":"A. Jain, Richa Gupta","doi":"10.1109/VLSID.2015.33","DOIUrl":"https://doi.org/10.1109/VLSID.2015.33","url":null,"abstract":"The standard UVM register package contains a built-in test sequence library that is used to perform most of the basic register and memory tests. These sequences are very useful for IP-level verification, but at SoC-level verification, where the number of registers is very large, they take a very long time to run. Similarly, users currently require strong knowledge of SV/UVM to use the UVM_REG register model, and the verification environment code seems very complex to verification engineers and designers who are not experts in UVM. Some limitations in the current version of the UVM_REG package, such as no automatic data checking for memory accesses and limited support for memory burst operations, were also seen. In this paper, we describe how we addressed these issues. We access processor-programmable registers and memories through a standard API (based on the UVM_REG register model) used in test development. This API is aimed at writing simpler directed tests that require little or no SV/UVM understanding. The API can be used to facilitate dumping register accesses for reuse from IP to SoC, or to format outputs for use in ATE test vector development, etc. These APIs provide basic to more complex OS-based capability. We also developed our own register/memory sequences to address SoC-level register and memory testing. Customized code is written to enhance the features of the standard UVM_REG register and memory model. IP-XACT based tools are also developed to automatically generate all required verification environment files for using the standard register model. 
Verification Environments with UVM_REG register model integrated are used to verify a variety of devices covering various protocols, applications and domains as the Internet of Things (IoT).","PeriodicalId":123635,"journal":{"name":"2015 28th International Conference on VLSI Design","volume":"245 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121283633","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Noise Aware CML Latch Modelling for Large System Simulation","authors":"D. Bhatta, Suvadeep Banerjee, A. Chatterjee","doi":"10.1109/VLSID.2015.55","DOIUrl":"https://doi.org/10.1109/VLSID.2015.55","url":null,"abstract":"Modern high speed communication systems often employ both analog and digital blocks. This poses a challenge for simulation of closed loop system dynamics in presence of non-idealities in any of the analog blocks. Due to the size and complexity of such systems it is not possible to do full system level simulation with circuit level models. The presence of digital control blocks makes it difficult to elevate block level observations to system level performance. A major challenge is the difficulty in estimating the error rate at the output of digital latches (continuous-time to discrete-time domain crossing boundaries) in the presence of noise and non-ideal analog input signals. Simplistic models used currently are often inadequate in capturing the long term effects of non ideal behavior at the block level. In this paper we propose a simulation framework to estimate latch transition probabilities in the response to distorted input and clock waveforms in presence of white noise. The evaluated transition probabilities can then be used to estimate system performance in an event driven Markov chain based model.","PeriodicalId":123635,"journal":{"name":"2015 28th International Conference on VLSI Design","volume":"12 1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116041713","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dharanidhar Dang, B. Patra, R. Mahapatra, M. Fiers
{"title":"Mode-Division-Multiplexed Photonic Router for High Performance Network-on-Chip","authors":"Dharanidhar Dang, B. Patra, R. Mahapatra, M. Fiers","doi":"10.1109/VLSID.2015.24","DOIUrl":"https://doi.org/10.1109/VLSID.2015.24","url":null,"abstract":"The communication bandwidth and power consumption of network-on-chip (NoC) are going to meet their limits soon because of traditional metallic interconnects. Photonic NoC is emerging as a promising alternative to address these bottlenecks. Photonic routers and silicon waveguides are used to realize switching and communication, respectively. In this paper, we propose a non-blocking, low-power, and high-performance 5×5 photonic router design using silicon microring resonators (MRR). A mode-division-multiplexing (MDM) scheme has been incorporated along with wavelength-division-multiplexing (WDM) and time-division-multiplexing (TDM) in the router to increase the aggregate bandwidth 4×, making it a suitable candidate for high performance NoC. The technique proposed here is the first of its kind to the best of our knowledge. The MDM based design permits multi-modal (here 2 modes) communication. As compared to a high-performance 45nm electronic router, the proposed router consumes 95% less power. 
Further the results show 50% less power consumption and 75% less insertion loss when compared to most recently reported photonic router results.","PeriodicalId":123635,"journal":{"name":"2015 28th International Conference on VLSI Design","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128754698","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parameterizable FPGA Framework for Particle Filter Based Object Tracking in Video","authors":"Pinalkumar Engineer, R. Velmurugan, S. Patkar","doi":"10.1109/VLSID.2015.11","DOIUrl":"https://doi.org/10.1109/VLSID.2015.11","url":null,"abstract":"Real-time particle filter based object tracking in videos on embedded platforms (FPGA) is challenging because of its resource usage and computational complexity. Furthermore, minor changes to the algorithm will need changes in the hardware. To address these issues, we propose a parametrizable FPGA framework for particle filter based object tracking algorithm. This parametrizable implementation can be used for various image sequences, object sizes and number of particles. By changing few parameters, this parametrization leads to appropriate changes in hardware resources resulting in efficient real-time operation of the algorithm. Experimental results show better tracking from the implementation and the proposed architecture can run particle filter algorithm for a color video sequence with 650 fps on average.","PeriodicalId":123635,"journal":{"name":"2015 28th International Conference on VLSI Design","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125800166","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Hardware and Thermal Analysis of DVFS in a Multi-core System with Hybrid WNoC Architecture","authors":"G. Harsha, H. Mondal, Sujay Deb","doi":"10.1109/VLSID.2015.25","DOIUrl":"https://doi.org/10.1109/VLSID.2015.25","url":null,"abstract":"Evolution of CMOS manufacturing technologies has led to billions of transistors per chip, many core and System-on-Chip (SoC) realizations in current day systems. But maintaining this trend is a significant challenge due to the power and thermal issues. As the devices are scaled and number of transistors on the chip increases, the power density across the chip increases rapidly with each generation. This further results in increased system temperature that can cause damage to the system. Dynamic Voltage/Frequency Scaling (DVFS) schemes reduce the power consumption without significant loss in system performance. In this paper, we design and evaluate a centralized DVFS control mechanism for multi core systems and discuss its merits and overheads. One of the major issues with centralized controller implementation is the long delays associated with signal transmission between the controller and different clusters in the system. To alleviate this issue, we use wireless interfaces for transmitting controller signals along with the data signals. Towards this goal, we design a dual band transceiver & antenna for the wireless interfaces and present their implementation details. 
Finally the thermal profile of the proposed DVFS mechanism is analyzed and compared with normal operating conditions.","PeriodicalId":123635,"journal":{"name":"2015 28th International Conference on VLSI Design","volume":"95 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125268580","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Nonlinear Analytical Optimization Method for Standard Cell Placement of VLSI Circuits","authors":"Sameer Pawanekar, G. Trivedi, K. Kapoor","doi":"10.1109/VLSID.2015.77","DOIUrl":"https://doi.org/10.1109/VLSID.2015.77","url":null,"abstract":"We present an analytical method to perform VLSI standard cell placement. We have developed a placement engine based on analytical methods that makes use of non-linear programming. First, we cluster a netlist to reduce the number of cells. In the second step, we perform quadratic optimization on the reduced netlist. Finally, we use the conjugate gradient method to solve the non-linear equations for the problem. The framework of our tool, Kapees2, is scalable and generates high quality results. Results on the IBM version 2 benchmarks are promising. Our placer outperforms Capo, Amoeba, NTUPlace3 and FengShui by 7%, 12%, 2% and 1%, respectively.","PeriodicalId":123635,"journal":{"name":"2015 28th International Conference on VLSI Design","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125658017","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
G. Salvador, Siddharth Nilakantan, B. Taskin, Mark Hempstead, A. More
{"title":"Effects of Nondeterminism in Hardware and Software Simulation with Thread Mapping","authors":"G. Salvador, Siddharth Nilakantan, B. Taskin, Mark Hempstead, A. More","doi":"10.1109/VLSID.2015.27","DOIUrl":"https://doi.org/10.1109/VLSID.2015.27","url":null,"abstract":"In this paper, we explore the simulation performance trade-off under the lens of Monte Carlo design space exploration for multi-threaded programs and thread mapping. The vehicle used for this exploration will be a recent study, whose novel Google Page Rank-based thread mapping approach is compared to hundreds of random mappings, as well as a Round-Robin-based thread mapping approach proposed in this paper used in similar comparisons. The modern simulator landscape presents a choice between cycle-accurate but slow, and fast but inaccurate program simulation. We find that the use of a fast, inaccurate multi-threaded simulator, such as Sniper 5.3, suffers from large nondeterminism in the reported performance of the program. We perform cycle-accurate simulation which demonstrates that the static thread mapping approach does provide benefits in reaching near-optimal design points. Furthermore, the runtime of static thread mapping is significantly reduced using a cycle-accurate simulator compared to the full Monte Carlo exploration of mapping design points.","PeriodicalId":123635,"journal":{"name":"2015 28th International Conference on VLSI Design","volume":"53 4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123197075","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}