{"title":"Chiplet-Based Advanced Packaging Technology from 3D/TSV to FOWLP/FHE","authors":"T. Fukushima","doi":"10.23919/VLSICircuits52068.2021.9492335","DOIUrl":"https://doi.org/10.23919/VLSICircuits52068.2021.9492335","url":null,"abstract":"More recently, \"chiplets\" are expected for further scaling the performance of LSI systems. However, system integration with the chiplets is not a new methodology. The basic concept dates back well over a few decades. The symbolic configuration of this concept based on the chiplets is 3D integration with TSV we have worked on since 1989. This paper introduces our 3D and heterogeneous system integration research from its historical activities to the latest efforts, including capillary self-assembly of tiny dies with a size of less than 0.1 mm and advanced flexible hybrid electronics (FHE) using fan-out wafer-level packaging (FOWLP).","PeriodicalId":106356,"journal":{"name":"2021 Symposium on VLSI Circuits","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116547649","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Buried nanomagnet realizing high-speed/low-variability silicon spin qubits: implementable in error-correctable large-scale quantum computers","authors":"Shota Iizuka, Kimihiko Kato, A. Yagishita, H. Asai, T. Ueda, H. Oka, J. Hattori, T. Ikegami, K. Fukuda, T. Mori","doi":"10.23919/VLSICircuits52068.2021.9492449","DOIUrl":"https://doi.org/10.23919/VLSICircuits52068.2021.9492449","url":null,"abstract":"We propose a buried nanomagnet (BNM) realizing high-speed/low-variability silicon spin qubit operation, inspired by buried wiring technology, for the first time. High-speed quantum-gate operation results from large slanting magnetic-field generated by the BNM disposed quite close to a spin qubit, and low-variation of fidelity thanks to the self-aligned fabrication process. Employing TCAD-based simulation, we demonstrate that the BNM realizes 10 times faster Rabi oscillation (faster spin-flip) than previous works and >99% fidelity under certain process variations. Also, the proposed BNM arrangement is implementable for error-correctable large-scale quantum computers employing a 2D-latticed qubit layout. This technology paves the way to practical large-scale quantum computers with silicon.","PeriodicalId":106356,"journal":{"name":"2021 Symposium on VLSI Circuits","volume":"89 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122092579","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A 10.0 ENOB, 6.2 fJ/conv.-step, 500 MS/s Ringamp-Based Pipelined-SAR ADC with Background Calibration and Dynamic Reference Regulation in 16nm CMOS","authors":"J. Lagos, N. Markulić, B. Hershberg, D. Dermit, M. Shrivas, E. Martens, J. Craninckx","doi":"10.23919/VLSICircuits52068.2021.9492354","DOIUrl":"https://doi.org/10.23919/VLSICircuits52068.2021.9492354","url":null,"abstract":"We present a single-channel fully-dynamic pipelined SAR ADC that leverages a novel quantizer and narrowband dither injection to achieve fast and comprehensive background calibration of DAC mismatch, interstage gain, and ring amplifier (ringamp) bias optimality. The ADC also includes an on-chip wide-range, fully-dynamic reference regulation system. Consuming 3.3 mW at 500 MS/s, it achieves 10.0 ENOB and 75.5 dB SFDR, yielding a Walden FoM of 6.2 fJ/c.s.","PeriodicalId":106356,"journal":{"name":"2021 Symposium on VLSI Circuits","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125659740","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A 6.54-to-26.03 TOPS/W Computing-In-Memory RNN Processor using Input Similarity Optimization and Attention-based Context-breaking with Output Speculation","authors":"Ruiqi Guo, Hao Li, Ruhui Liu, Zhixiao Zhang, Limei Tang, Hao Sun, Leibo Liu, Meng-Fan Chang, Shaojun Wei, S. Yin","doi":"10.23919/VLSICircuits52068.2021.9492492","DOIUrl":"https://doi.org/10.23919/VLSICircuits52068.2021.9492492","url":null,"abstract":"This work presents a 65nm RNN processor with computing-in-memory (CIM) macros. The main contributions include: 1) A similarity analyzer (SimAyz) to fully leverage the temporal stability of input sequences with 1.52× performance speedup; 2) An attention-based context-breaking (AttenBrk) method with output speculation to reduce off-chip data accesses by up to 30.3%; 3) A double-buffering scheme for CIM macros to hide writing latency and a pipeline processing element (PE) array to increase the system throughput. Measured results show that this chip achieves 6.54-to-26.03 TOPS/W energy efficiency across various LSTM benchmarks.","PeriodicalId":106356,"journal":{"name":"2021 Symposium on VLSI Circuits","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130816612","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analog In-memory Computing in FeFET-based 1T1R Array for Edge AI Applications","authors":"D. Saito, T. Kobayashi, H. Koga, N. Ronchi, K. Banerjee, Y. Shuto, J. Okuno, Kenta Konishi, L. Piazza, A. Mallik, J. V. Houdt, M. Tsukamoto, K. Ohkuri, T. Umebayashi, T. Ezaki","doi":"10.23919/VLSICircuits52068.2021.9492479","DOIUrl":"https://doi.org/10.23919/VLSICircuits52068.2021.9492479","url":null,"abstract":"Deep neural network (DNN) inference for edge AI requires low-power operation, which can be achieved by implementing massively parallel matrix-vector multiplications (MVM) in the analog domain on a highly resistive memory array. We propose a 1T1R compute cell (1T1R-cell) using a ferroelectric hafnium oxide-based FET (FeFET) and TiN/SiO2 tunneling junction of MΩ resistor (MOR) for analog in-memory computing (AiMC). The MOR exhibited a tunneling current behavior and MΩ resistance. A 1T1R-cell array-level evaluation was also performed. A random access for writing with low write disturbance scheme was confirmed from the summation-DC-current output, and binaries were successfully classified into “T” and “L.” Based on the experimental results of our proposed 1T1R-cell, we obtained a state-of-the-art energy efficiency of 13700 TOPS/W including the periphery. Furthermore, we confirmed that a high inference accuracy can be obtained with our low-resistance-variability 1T1R-cell with a properly trained model.","PeriodicalId":106356,"journal":{"name":"2021 Symposium on VLSI Circuits","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126459073","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Monolithic Microring-based WDM Optical I/O for Heterogeneous Computing","authors":"M. Wade, Daniel Jeong, Byungchae Kim, Mason Zhang, Woorham Bae, Chong Zhang, Pavan Bhargava, D. V. Orden, S. Ardalan, C. Ramamurthy, E. Anderson, Austin Katzin, Haiwei Lu, S. Buchbinder, Behrooz Beheshtian, A. Khilo, M. Rust, Chen Li, Forrest Sedgwick, J. Fini, Roy Meade, V. Stojanović, Chen Sun","doi":"10.23919/VLSICircuits52068.2021.9492382","DOIUrl":"https://doi.org/10.23919/VLSICircuits52068.2021.9492382","url":null,"abstract":"For the first time, we demonstrate an error-free, 128Gbps (8x16Gbps) optical transceiver using a microring-based wavelength-division multiplexed (WDM) architecture. The optical transceiver ran for 12 hours with zero errors, resulting in a measured bit-error rate of <1.45e-15 per optical lane. The total number of bits sent during this time was ~691 terabits per lane and ~5.5 petabits aggregate across all lanes.","PeriodicalId":106356,"journal":{"name":"2021 Symposium on VLSI Circuits","volume":"112 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117199053","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"VOTA: A 2.45TFLOPS/W Heterogeneous Multi-Core Visual Object Tracking Accelerator Based on Correlation Filters","authors":"Junkang Zhu, Wei-Chien Tang, Ching-En Lee, Haolei Ye, Eric C. McCreath, Zhengya Zhang","doi":"10.23919/VLSICircuits52068.2021.9492379","DOIUrl":"https://doi.org/10.23919/VLSICircuits52068.2021.9492379","url":null,"abstract":"VOTA is a domain-specific accelerator for correlation filter (CF)-based visual object tracking (VOT). It encompasses a Winograd convolution core, an FFT core and a vector core in a high-bandwidth starring topology. VOTA’s frame-based instructions and execution enable a 537GFLOPS performance and reduce the code size. An instruction-chaining mechanism permits inter-core pipelining to improve the utilization to 84.2%. A 10.2 mm² 28nm FP16 VOTA prototype incorporating a RISC-V host CPU is measured to achieve 2.45TFLOPS/W at 0.72V. Running OPCF, a CF-based VOT enhanced by adaptive boosting and particle filters, the chip achieves 1157FPS on 640×480 input frames at 0.9V and 175MHz, consuming 296mW.","PeriodicalId":106356,"journal":{"name":"2021 Symposium on VLSI Circuits","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123543574","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Uniform Spin Qubit Devices with Tunable Coupling in an All-Silicon 300 mm Integrated Process","authors":"N. D. Stuyck, Roy Li, C. Godfrin, A. Elsayed, S. Kubicek, J. Jussot, B. Chan, F. Mohiyaddin, M. Shehata, G. Simion, Y. Canvel, L. Goux, Heyns Heyns, B. Govoreanu, I. Radu","doi":"10.23919/VLSICircuits52068.2021.9492427","DOIUrl":"https://doi.org/10.23919/VLSICircuits52068.2021.9492427","url":null,"abstract":"Larger arrays of electron spin qubits require radical improvements in fabrication and device uniformity. Here we demonstrate excellent qubit device uniformity and tunability from 300K down to mK temperatures. This is achieved, for the first time, by integrating an overlapping polycrystalline silicon-based gate stack in an ‘all-Silicon’ and lithographically flexible 300mm flow. Low-disorder Si/SiO2 is proved by a 10K Hall mobility of 1.5·10⁴ cm²/Vs. Well-controlled sensors with low charge noise (3.6 µeV/√Hz at 1 Hz) are used for charge sensing down to the last electron. We demonstrate excellent and reproducible interdot coupling control over nearly 2 decades (2-100 GHz). We show spin manipulation and single-shot spin readout, extracting a valley splitting energy of around 150 µeV. These low-disorder, uniform qubit devices and 300mm fab integration pave the way for fast scale-up to large quantum processors.","PeriodicalId":106356,"journal":{"name":"2021 Symposium on VLSI Circuits","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115352719","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Advanced Multi-NIR Spectral Image Sensor with Optimized Vision Sensing System and Its Impact on Innovative Applications","authors":"H. Sumi, H. Takehara, J. Ohta, M. Ishikawa","doi":"10.23919/VLSICircuits52068.2021.9492443","DOIUrl":"https://doi.org/10.23919/VLSICircuits52068.2021.9492443","url":null,"abstract":"Innovative applications with multiple near-infrared (multi-NIR) spectral CMOS image sensors (CIS) and camera systems have recently been developed. The multi-NIR filter is an indispensable key technology for practical use of the multi-NIR camera system in consumer cameras. Advanced processing technology for multi-NIR signals has been developed using a Fabry-Perot structure. Three types of NIR wavelength filters are formed as a Bayer pattern with 2 × 2 μm² pixel size on a 5-M pixel BSI-CIS. The thickness differences of the three types of bandpass filters are suppressed to less than 75 nm. To enable applications in surveillance, automobiles, and fundus cameras for health management, signal processing technology has also been developed that processes and mixes each signal of a multi-NIR signal with low-intensity visible light images. This provides good image SNR (Signal-to-Noise Ratio) under low lighting conditions of 0.1 lux or less, allowing changes of state to be easily identified.","PeriodicalId":106356,"journal":{"name":"2021 Symposium on VLSI Circuits","volume":"25 2","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120889898","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A 50 Gbps 134 GHz 16 QAM 3 m Dielectric Waveguide Transceiver System Implemented in 22nm CMOS","authors":"T. Brown, G. Dogiamis, Yi-Shin Yeh, D. Correas-Serrano, Triveni S. Rane, Surej Ravikumar, Jessica C. Chou, V. Neeli, Mauricio Marulanda, N. P. Gaunkar, Cho-Ying Lu, I-Lun Huang, R. Sadhwani, Hyung-Jin Lee, H. Chandrakumar, Jeffery W. Bates, Zinia Tuli, Q. Yu, M. Weiss, J. Rangaswamy, C. Nieva, D. Frolov, T. Kamgaing, Y. S. Nam, H. Braunisch, S. Rami","doi":"10.23919/VLSICircuits52068.2021.9492497","DOIUrl":"https://doi.org/10.23919/VLSICircuits52068.2021.9492497","url":null,"abstract":"A 134 GHz 16 QAM fully-packaged transceiver system for dielectric waveguides with >12 GHz of RF bandwidth built in 22nm CMOS achieves a measured EVM of -19.8 dB (~5×10⁻⁶ BER) at a reach of 3 meters at a 50 Gbps data rate at a total power consumption of 494 mW from a 1.0 V supply. It achieves a FoM of 3.3 pJ/bit/m with the highest reported data rate at a distance greater than 2 m to date.","PeriodicalId":106356,"journal":{"name":"2021 Symposium on VLSI Circuits","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126450211","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}