{"title":"IBM CMOS compatible photonics and traveling wave electro-optic modulator design","authors":"D. Gill","doi":"10.1109/SLIP.2013.6681677","DOIUrl":"https://doi.org/10.1109/SLIP.2013.6681677","url":null,"abstract":"Summary form only given. This talk will give a general overview of the IBM Silicon Photonics program and specifically discuss CMOS compatible traveling wave electro-optic modulator design. A Non-Return-to-Zero Transmitter-link penalty calculation protocol for Mach-Zehnder Interferometric modulators based on the phase shifter efficiency-loss figure-of-merit will be presented. Our Transmitter-link penalty analysis protocol allows one to easily assess an expected penalty estimation from only the RF Vpp drive, modulator efficiency loss FOM, and the assumption that transmitter bandwidth is sufficient to support the link data rate, which allows system designers to better understand how device-level performance metrics impact system link budget.","PeriodicalId":385305,"journal":{"name":"2013 ACM/IEEE International Workshop on System Level Interconnect Prediction (SLIP)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114148168","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Channel routing for integrated optics","authors":"Christopher Condrat, P. Kalla, S. Blair","doi":"10.1109/SLIP.2013.6681678","DOIUrl":"https://doi.org/10.1109/SLIP.2013.6681678","url":null,"abstract":"The increasing scope and applications of integrated optics necessitate the development of automated techniques for the physical design of optical systems. This paper presents an automated, planar channel routing technique for integrated optical waveguides. Integrated optics is a planar technology and lacks the inherent signal restoration capabilities of static CMOS. Therefore, signal loss minimization - as a function of waveguide crossings and bends - is the primary objective of this technique. This is in contrast to the track and wire-length minimization of traditional VLSI routing. Our optical channel router guarantees minimal waveguide crossings by drawing upon sorting-based techniques for waveguide routing. To further improve our solutions in terms of signal loss, we extend the router to reduce the number of bends produced during routing. Finally, we implement the optical channel routing technique and describe the experimental results, comparing the costs of routing solutions with respect to waveguide crossings, bends, and channel height.","PeriodicalId":385305,"journal":{"name":"2013 ACM/IEEE International Workshop on System Level Interconnect Prediction (SLIP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114510920","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Toward quantifying the IC design value of interconnect technology improvements","authors":"T. Chan, A. Kahng, Jiajia Li","doi":"10.1109/SLIP.2013.6681680","DOIUrl":"https://doi.org/10.1109/SLIP.2013.6681680","url":null,"abstract":"As technology scales, wire delay due to interconnect resistance (R) and capacitance (C) is increasing. Thus, improvement of middle-of-line and back-end-of-line (BEOL) materials and process technology (e.g., to achieve reduced barrier material thickness or dielectric permittivity) has always been a key goal in the technology roadmap. However, to date there has not been any systematic quantification of the value of BEOL technology improvements on integrated circuit (IC) design metrics. In this work, we create a framework to study the impact of improvements in interconnect technology on IC designs. Using 45nm technology and benchmark designs from public sources, we map reductions of interconnect resistance and/or capacitance to resulting impacts on design power, performance and area - for various types of physical design and operating contexts. By quantifying potential benefits of interconnect technology improvements at a block or core level, our proposed framework complements lower-level (e.g., critical-path) projections. We believe that this type of early assessment can be useful to guide BEOL technology investments and targets, especially as technology improvements require ever-increasing resources and focus in R&D efforts.","PeriodicalId":385305,"journal":{"name":"2013 ACM/IEEE International Workshop on System Level Interconnect Prediction (SLIP)","volume":"79 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130287239","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Worst-case noise prediction using power network impedance profile","authors":"Xiang Zhang, Yang Liu, Chung-Kuan Cheng","doi":"10.1109/SLIP.2013.6681681","DOIUrl":"https://doi.org/10.1109/SLIP.2013.6681681","url":null,"abstract":"We propose a novel method to predict the worst-case noise using the power distribution network (PDN) impedance profile. The traditional target impedance method lacks the accuracy to estimate the worst-case noise. The convolution of impulse responses method can provide accurate noise prediction but cannot provide intuitive guidelines for the optimization. In this paper, we first analyze the ratio of the time-domain maximum output voltage noise to the multiplication of target impedance when time-domain maximum input current is confined to one. In particular, for a typical PDN with two-stage or three-stage RLC tanks, the maximum ratio can be 2.09 and 2.72, respectively. We then propose our prediction in a standard RLC tank. We further extend it to analyze real PDN structures with multistage RLC tanks. Our results show that the proposed method can intuitively and accurately estimate the worst-case noise and provide straightforward design guidelines to improve PDN performance. For a typical lumped PDN with two-stage RLC tanks, the estimation error is within ±6%.","PeriodicalId":385305,"journal":{"name":"2013 ACM/IEEE International Workshop on System Level Interconnect Prediction (SLIP)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130410123","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluating the scalability and performance of 3D stacked reconfigurable nanophotonic interconnects","authors":"R. Morris, Avinash Karanth Kodi, A. Louri","doi":"10.1109/SLIP.2013.6681676","DOIUrl":"https://doi.org/10.1109/SLIP.2013.6681676","url":null,"abstract":"As we integrate hundreds of cores in the future, the energy efficiency and scalability of Networks-on-Chip (NoCs) have become a critical challenge. In order to achieve higher performance-per-Watt than traditional metallic interconnects, researchers are exploring alternative energy-efficient emerging technology solutions. In this paper, we propose to combine two emerging technologies, namely 3D stacking and nanophotonics, that can deliver high on-chip bandwidth and low energy/bit to achieve a high-throughput, reconfigurable and scalable NoC for many-core systems. Our simulation results indicate that the execution time can be reduced by up to 25% and energy consumption reduced by 23% for Splash-2, PARSEC, SPEC CPU2006 and synthetic benchmarks for 64-core and 256-core versions.","PeriodicalId":385305,"journal":{"name":"2013 ACM/IEEE International Workshop on System Level Interconnect Prediction (SLIP)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130147393","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Chip-scale physical interconnect models (Tutorial)","authors":"R. Topaloglu","doi":"10.1109/SLIP.2013.6681684","DOIUrl":"https://doi.org/10.1109/SLIP.2013.6681684","url":null,"abstract":"Modeling layout-dependent interconnect processing steps is useful to predict integrated circuit design behavior. We illustrate key data and steps in developing etch, electrochemical deposition (ECD), and chemical-mechanical polishing (CMP) models in order to predict chip topography. We utilize an interferometer for validation of models for the first time. Such models are useful to select optimal fill algorithms using a novel DOE-based flow as proposed herein.","PeriodicalId":385305,"journal":{"name":"2013 ACM/IEEE International Workshop on System Level Interconnect Prediction (SLIP)","volume":"68 11","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114042759","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning-based approximation of interconnect delay and slew in signoff timing tools","authors":"A. Kahng, Seokhyeong Kang, Hyein Lee, S. Nath, Jyoti Wadhwani","doi":"10.1109/SLIP.2013.6681682","DOIUrl":"https://doi.org/10.1109/SLIP.2013.6681682","url":null,"abstract":"Incremental static timing analysis (iSTA) is the backbone of iterative sizing and Vt-swapping heuristics for post-layout timing recovery and leakage power reduction. Performing such analysis through available interfaces of a signoff STA tool brings efficiency and functionality limitations. Thus, an internal iSTA tool must be built that matches the signoff STA tool. A key challenge is the matching of “black-box” modeling of interconnect effects in the signoff tool, so as to match wire slew, wire delay, gate slew and gate delay on each arc of the timing graph. Previous moment-based analytical models for gate and wire slew and delay typically have large errors when compared to values from signoff STA tools. To mitigate the accumulation of these errors and preserve timing correlation, sizing tools must invoke the signoff STA tool frequently, thus incurring large runtime costs. In this work, we pursue a learning-based approach to fit analytical models of wire slew and delay to estimates from a signoff STA tool. These models can improve the accuracy of delay and slew estimations, such that the number of invocations of the signoff STA tool during sizing optimizations is significantly reduced.","PeriodicalId":385305,"journal":{"name":"2013 ACM/IEEE International Workshop on System Level Interconnect Prediction (SLIP)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130511371","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Wireless on Networks-on-Chip","authors":"B. Taskin","doi":"10.1109/SLIP.2013.6681675","DOIUrl":"https://doi.org/10.1109/SLIP.2013.6681675","url":null,"abstract":"On-chip wireless interconnects are being investigated for applicability on network-on-chip systems of contemporary Multiprocessor Systems-on-chip (MPSoCs). Targeting both 2D and 3D semiconductor technologies, wireless interconnects are established with multiple antennas on the same die or couplers on the layers of a 3D IC package. The wireless interconnects are typically considered as a hierarchical layer or a supplementary network utilized in a hybrid implementation with the traditional wire-based interconnects of the common network-on-chip implementations.","PeriodicalId":385305,"journal":{"name":"2013 ACM/IEEE International Workshop on System Level Interconnect Prediction (SLIP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115059762","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"High-dimensional metamodeling for prediction of clock tree synthesis outcomes","authors":"A. Kahng, Bill Lin, S. Nath","doi":"10.1109/SLIP.2013.6681685","DOIUrl":"https://doi.org/10.1109/SLIP.2013.6681685","url":null,"abstract":"Clock tree synthesis (CTS) is a key aspect of on-chip interconnect, and a major consumer of IC power and physical design resources. In modern sub-28nm tools and flows, it has become exceptionally difficult to satisfy skew, insertion delay and transition time constraints within power and area budgets, in part because commercial tools (with their many knobs) have become highly complex. This complexity, along with the complicated structure of real-world CTS instances (hierarchy, dividers, etc.) and floorplan contexts (aspect ratios, obstacles, etc.), makes it very difficult to predict skew, power and other important metrics of CTS outcomes. In this work, we study CTS estimation in the high-dimensional parameter space of instance constraints and floorplan contexts. Using two leading commercial CTS tools as our testbed, we develop predictors, classifiers and “field of use” characterizations that can enable IC design teams to achieve required CTS solution quality through understanding of appropriate parameter subspaces. Our hierarchical hybrid surrogate modeling approach mitigates challenges of parameter multicollinearity in high dimensions. It achieves, e.g., worst-case estimation errors of 13% in contrast to 30% errors in [17]. We demonstrate use of a 94%-accurate “oracle” classifier and estimation models to predictably achieve CTS outcomes that meet specified constraints and target metrics.","PeriodicalId":385305,"journal":{"name":"2013 ACM/IEEE International Workshop on System Level Interconnect Prediction (SLIP)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117042420","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimizations in GPU: Smart compilers and core-level reconfiguration","authors":"Deming Chen","doi":"10.1109/SLIP.2013.6681686","DOIUrl":"https://doi.org/10.1109/SLIP.2013.6681686","url":null,"abstract":"Summary form only given. Graphics processing units (GPUs) are increasingly critical for general-purpose parallel processing performance. GPU hardware is composed of many streaming multiprocessors, allowing GPUs to execute tens of thousands of threads in parallel. However, due to the SIMD (single-instruction multiple-data) execution style, resource utilization and thus overall performance can be significantly affected if computation threads must take diverging control paths. Meanwhile, tuning GPU applications' performance is also a complex and labor-intensive task. Software programmers employ a variety of optimization techniques to explore tradeoffs between the thread parallelism and performance of a single thread. New GPU architectures also allow concurrent kernel execution, which introduces interesting kernel scheduling problems. In the first part of the talk, we will mainly introduce our recent studies on control flow optimization, joint optimization of register allocation and thread structure, and concurrent kernel scheduling, for GPU performance improvements. Energy efficiency of GPUs for general-purpose computing is increasingly important as well. The integration of GPUs onto SoCs for use in mobile devices in the last 5 years has further exacerbated the need to reduce the energy footprint of GPUs. In the second part of the talk, we propose a novel GPU architecture that makes use of reconfiguration to exploit ILP and DVFS (Dynamic Voltage and Frequency Scaling) techniques to reduce the power consumption, without sacrificing the computational throughput. We expect that applications with large amounts of ILP should see dramatic improvements in their energy and power, when compared to nominal CUDA-based architectures. In addition to this, we foresee interesting challenges with respect to scheduling of threads and the re-organization of CUDA warp structures and schedules. We also note that dynamic reconfiguration of cores within a SIMD unit (SM in CUDA) affects the number of threads that can execute concurrently and thus would change the number of effective warps in flight, which may affect the capability to overlap execution time and memory latency.","PeriodicalId":385305,"journal":{"name":"2013 ACM/IEEE International Workshop on System Level Interconnect Prediction (SLIP)","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128220476","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}