Sung Kim, Morteza Fayazi, A. Daftardar, Kuan-Yu Chen, Jielun Tan, S. Pal, T. Ajayi, Yan Xiong, T. Mudge, C. Chakrabarti, D. Blaauw, R. Dreslinski, Hun-Seok Kim
{"title":"Versa: A Dataflow-Centric Multiprocessor with 36 Systolic ARM Cortex-M4F Cores and a Reconfigurable Crossbar-Memory Hierarchy in 28nm","authors":"Sung Kim, Morteza Fayazi, A. Daftardar, Kuan-Yu Chen, Jielun Tan, S. Pal, T. Ajayi, Yan Xiong, T. Mudge, C. Chakrabarti, D. Blaauw, R. Dreslinski, Hun-Seok Kim","doi":"10.23919/VLSICircuits52068.2021.9492391","DOIUrl":"https://doi.org/10.23919/VLSICircuits52068.2021.9492391","url":null,"abstract":"We present Versa, an energy-efficient processor with 36 systolic ARM Cortex-M4F cores and a runtime-reconfigurable memory hierarchy. Versa exploits algorithm-specific characteristics in order to optimize bandwidth, access latency, and data reuse. Measured on a set of kernels with diverse data access, control, and synchronization characteristics, reconfiguration between different Versa modes yields median energy-efficiency improvements of 11.6× and 37.2× over mobile CPU and GPU baselines, respectively.","PeriodicalId":106356,"journal":{"name":"2021 Symposium on VLSI Circuits","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133315626","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Gyeong-Gu Kang, Seok-Tae Koh, W. Jang, Ji-Hun Lee, Seongjoo Lee, O. Kwon, K. Jung, Hyunsik Kim
{"title":"A 12-Bit Mobile OLED/μLED Display Driver IC with Cascaded Loading-Free Capacitive Interpolation DAC and 6.24V/μs-Slew-Rate Buffer Amplifier","authors":"Gyeong-Gu Kang, Seok-Tae Koh, W. Jang, Ji-Hun Lee, Seongjoo Lee, O. Kwon, K. Jung, Hyunsik Kim","doi":"10.23919/VLSICircuits52068.2021.9492490","DOIUrl":"https://doi.org/10.23919/VLSICircuits52068.2021.9492490","url":null,"abstract":"This paper presents an OLED/μLED display driver IC with a cascaded loading-free capacitive interpolation (LFCI) DAC and a high-slew buffer amplifier. The 12-bit color depth is realized by combining a 7-bit R-DAC with the proposed 5-bit LFCI DAC while occupying only 295×17μm2, a 2× reduction compared to the state of the art. In-pixel MSB conversion is also presented to reduce chip size further. The 5V amplifier offers a slew rate of 6.24V/μs at 80pF with a static current of 2μA. The chip, fabricated in a 180-nm process, achieved measured values of 0.43LSB (DNL), 0.95LSB (INL), and 7.9mV (DVO).","PeriodicalId":106356,"journal":{"name":"2021 Symposium on VLSI Circuits","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116683251","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
R. Khaddam-Aljameh, M. Stanisavljevic, J. F. Mas, G. Karunaratne, M. Braendli, Femg Liu, Abhairaj Singh, S. M. Müller, U. Egger, A. Petropoulos, T. Antonakopoulos, K. Brew, Samuel Choi, I. Ok, F. Lie, N. Saulnier, V. Chan, I. Ahsan, V. Narayanan, S. Nandakumar, M. L. Gallo, P. Francese, A. Sebastian, E. Eleftheriou
{"title":"HERMES Core – A 14nm CMOS and PCM-based In-Memory Compute Core using an array of 300ps/LSB Linearized CCO-based ADCs and local digital processing","authors":"R. Khaddam-Aljameh, M. Stanisavljevic, J. F. Mas, G. Karunaratne, M. Braendli, Femg Liu, Abhairaj Singh, S. M. Müller, U. Egger, A. Petropoulos, T. Antonakopoulos, K. Brew, Samuel Choi, I. Ok, F. Lie, N. Saulnier, V. Chan, I. Ahsan, V. Narayanan, S. Nandakumar, M. L. Gallo, P. Francese, A. Sebastian, E. Eleftheriou","doi":"10.23919/VLSICircuits52068.2021.9492362","DOIUrl":"https://doi.org/10.23919/VLSICircuits52068.2021.9492362","url":null,"abstract":"We present a 256×256 in-memory compute (IMC) core designed and fabricated in 14nm CMOS with backend-integrated multi-level phase-change memory (PCM). It comprises 256 linearized current controlled oscillator (CCO)-based ADCs at a compact 4µm pitch and a local digital processing unit performing affine scaling and ReLU operations. A novel frequency-linearization technique for CCOs is introduced, leading to accurate on-chip matrix-vector-multiply (MVM) when operating over 1 GHz. Measured classification accuracies on MNIST and CIFAR-10 datasets are presented when two cores are employed for deep learning (DL) inference. The measured energy efficiency is 10.5 TOPS/W at a performance density of 1.59 TOPS/mm2.","PeriodicalId":106356,"journal":{"name":"2021 Symposium on VLSI Circuits","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114721731","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ken Namura, Johannes Maximilian Kühn, T. Adachi, H. Imachi, H. Kaneko, T. Kato, Go Watanabe, Naoto Tanaka, S. Kashihara, Hiroshi Miyashita, Y. Tomonaga, Ryosuke Okuta, Takuya Akiba, Brian K. Vogel, S. Kitajo, F. Osawa, K. Takahashi, Y. Takatsukasa, K. Mizumaru, T. Yamauchi, J. Ono, A. Takahashi, Tanvir Ahmed, Y. Doi, K. Hiraki, J. Makino
{"title":"MN-Core - A Highly Efficient and Scalable Approach to Deep Learning","authors":"Ken Namura, Johannes Maximilian Kühn, T. Adachi, H. Imachi, H. Kaneko, T. Kato, Go Watanabe, Naoto Tanaka, S. Kashihara, Hiroshi Miyashita, Y. Tomonaga, Ryosuke Okuta, Takuya Akiba, Brian K. Vogel, S. Kitajo, F. Osawa, K. Takahashi, Y. Takatsukasa, K. Mizumaru, T. Yamauchi, J. Ono, A. Takahashi, Tanvir Ahmed, Y. Doi, K. Hiraki, J. Makino","doi":"10.23919/VLSICircuits52068.2021.9492395","DOIUrl":"https://doi.org/10.23919/VLSICircuits52068.2021.9492395","url":null,"abstract":"MN-Core is a highly efficient deep learning training accelerator reaching in excess of 1 TFLOPS/W (half-precision) at board level in real-world mixed-precision workloads. To reach and sustain this level of performance, the design is partitioned and packaged as a four-die MCM package exceeding 3000mm2 of die area.","PeriodicalId":106356,"journal":{"name":"2021 Symposium on VLSI Circuits","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131981380","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PETRA: A 22nm 6.97TFLOPS/W AIB-Enabled Configurable Matrix and Convolution Accelerator Integrated with an Intel Stratix 10 FPGA","authors":"Sung-gun Cho, Wei-Chien Tang, Chester Liu, Zhengya Zhang","doi":"10.23919/VLSICircuits52068.2021.9492517","DOIUrl":"https://doi.org/10.23919/VLSICircuits52068.2021.9492517","url":null,"abstract":"PETRA is a configurable FP16 matrix multiplication and convolution accelerator designed to be 2.5D integrated using Advanced Interface Bus (AIB). PETRA is built upon four 16×16 systolic arrays, but it employs a configurable H-tree accumulation to improve both the latency and the utilization by up to 8×. A 22nm 3.04mm2 PETRA prototype provides 1.433TFLOPS in computing matrix-matrix multiplication (MMM) and convolution (conv) at 0.88V, and it achieves a 6.97TFLOPS/W peak efficiency at 0.7V. PETRA is integrated with an Intel Stratix 10 FPGA in a multi-chip package (MCP) to provide the flexibility of FPGA and the performance and efficiency of PETRA.","PeriodicalId":106356,"journal":{"name":"2021 Symposium on VLSI Circuits","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128215773","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
F. Disegni, A. Ventre, A. Molgora, P. Cappelletti, R. Badalamenti, P. Ferreira, G. Castagna, A. Cathelin, A. Gandolfo, A. Redaelli, D. Manfrè, A. Maurelli, C. Torti, F. Piazza, M. Carfì, F. Arnaud, M. Perroni, M. Caruso, S. Pezzini, R. Annunziata, G. Piazza, O. Weber, M. Peri
{"title":"16MB High Density Embedded PCM macrocell for automotive-grade microcontroller in 28nm FD-SOI, featuring extension to 24MB for Over-The-Air software update","authors":"F. Disegni, A. Ventre, A. Molgora, P. Cappelletti, R. Badalamenti, P. Ferreira, G. Castagna, A. Cathelin, A. Gandolfo, A. Redaelli, D. Manfrè, A. Maurelli, C. Torti, F. Piazza, M. Carfì, F. Arnaud, M. Perroni, M. Caruso, S. Pezzini, R. Annunziata, G. Piazza, O. Weber, M. Peri","doi":"10.23919/VLSICircuits52068.2021.9492465","DOIUrl":"https://doi.org/10.23919/VLSICircuits52068.2021.9492465","url":null,"abstract":"This paper proposes a 16MB e-NVM macrocell for an automotive grade 0 microcontroller based on a PCM cell with a Bipolar Transistor (BJT) selector. The solution is developed in proprietary 28nm FD-SOI CMOS technology, with a Super-STI scheme that has enabled high-density e-NVM with a 0.019µm2 cell size [1]. Macrocell organization offers the capability to be configured by application either for Over-The-Air (OTA) mode up to 24MB, or for 16MB extra reliability mode, with two cells per bit, still resulting in an extremely competitive equivalent bit-cell size (0.038µm2). Cell Mode configuration can be dynamically tuned, with a unique set of features for flexible assisted OTA software update. The integration of a 16MB PCM cell array, extensible up to 24MB, in an automotive grade product-like test vehicle chip is presented here as the evolution of the first Embedded PCM macrocell for automotive [2], complementing the fulfillment of all criteria in the demanding automotive environment [3] [4].","PeriodicalId":106356,"journal":{"name":"2021 Symposium on VLSI Circuits","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128393475","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Alessandro Novello, Gabriele Atzeni, Giorgio Cristiano, Mathieu Coustans, Taekwang Jang
{"title":"A 2.3GHz Fully Integrated DC-DC Converter based on Electromagnetically Coupled Class-D LC Oscillators achieving 78.1% Efficiency in 22nm FDSOI CMOS","authors":"Alessandro Novello, Gabriele Atzeni, Giorgio Cristiano, Mathieu Coustans, Taekwang Jang","doi":"10.23919/VLSICircuits52068.2021.9492491","DOIUrl":"https://doi.org/10.23919/VLSICircuits52068.2021.9492491","url":null,"abstract":"A fully integrated DC-DC converter based on electromagnetically coupled class-D LC oscillators achieving 0.42-3.2W/mm2 power density and 69.4-78.1% efficiency is demonstrated in a 22nm FDSOI CMOS technology. This work proposes on-chip 8-shaped and vertically stacked transformers, which are orthogonally placed for high power density, a low undesired coupling coefficient, and low electromagnetic interference (EMI) radiation. In addition, the output ripple is <10mV without attaching any output capacitor thanks to the 4-phase electromagnetic power delivery scheme. The converter also offers a duty-cycled operation mode that enables <2% efficiency degradation down to 100μW. The total chip area is 0.59mm2 for 5.9nH inductance (high efficiency version) and 0.22mm2 for 3.9nH (high power density version).","PeriodicalId":106356,"journal":{"name":"2021 Symposium on VLSI Circuits","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133979960","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Battery-Less IoT Sensor Node with PLL-Less WiFi Backscattering Communications in a 2.5-μW Peak Power Envelope","authors":"Longyang Lin, K. Ahmed, P. Salamani, M. Alioto","doi":"10.23919/VLSICircuits52068.2021.9492358","DOIUrl":"https://doi.org/10.23919/VLSICircuits52068.2021.9492358","url":null,"abstract":"A system on chip including 802.11b WiFi communications is introduced to demonstrate battery-less operation for low-cost mm-scale sensor nodes. µW peak power is enabled by PLL-less WiFi backscattering communications and event-driven frequency regulation to compensate for environmental variations. A 180nm testchip integrating the entire signal chain from any of four sensor interfaces to wireless communications with a commercial WiFi router exhibits 2.5µW total power.","PeriodicalId":106356,"journal":{"name":"2021 Symposium on VLSI Circuits","volume":"697 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133167054","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hyunjin Shin, Myeonghee Oh, Jaeseung Choi, T. Song, J. Kye
{"title":"A 28nm Embedded Flash Memory with 100MHz Read Operation and 7.42Mb/mm2 at 0.85V featuring for Automotive Application","authors":"Hyunjin Shin, Myeonghee Oh, Jaeseung Choi, T. Song, J. Kye","doi":"10.23919/VLSICircuits52068.2021.9492384","DOIUrl":"https://doi.org/10.23919/VLSICircuits52068.2021.9492384","url":null,"abstract":"This paper presents a 28nm embedded Flash memory designed for foundry automotive applications. A Temperature Auto-Tracking Sense Amplifier using Bit line Charge Boost (BCB) and Bit line Leakage current Compensation (BLC) techniques achieves a read operation under 10ns (>100MHz) and improved density (7.42Mb/mm2). In addition, Word Line and YMUX Gate Boost (WYGB) is applied to secure the sensing margin at a low voltage (0.85V). By improving the sensing margin over a temperature range of -40~150°C, these techniques enable a 10ns read of 288 bits at a time (26.8Gb/s) in a 16Mb memory. The design also achieves a competitive minimum IP size, and silicon validation confirmed a yield high enough for mass production. Based on this competitive advantage through technology differentiation, it will be provided to various customers in all eFlash IP Foundry markets, including the automotive business.","PeriodicalId":106356,"journal":{"name":"2021 Symposium on VLSI Circuits","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116400675","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}