{"title":"Nvidia Data Center Processing Unit (DPU) Architecture","authors":"Idan Burstein","doi":"10.1109/HCS52781.2021.9567066","DOIUrl":"https://doi.org/10.1109/HCS52781.2021.9567066","url":null,"abstract":"The NVIDIA DPU enables the data center as the new unit of computing. The CPU can no longer do it all: server infrastructure tasks must be offloaded and isolated to a DPU. An effective DPU must offer hardware acceleration and security isolation. Enabling such a DPU requires a broad software ecosystem (DOCA) that utilizes hardware acceleration across a variety of disciplines (e.g. HPC, AI/ML, storage, networking, security). NVIDIA DPU & DOCA is a computing platform with a rich stack optimized for AI, bare metal cloud, cloud supercomputing, storage, gaming, 5G wireless, and more. NVIDIA is committed to line rate performance every generation.","PeriodicalId":246531,"journal":{"name":"2021 IEEE Hot Chips 33 Symposium (HCS)","volume":"78 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126586932","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Industry's First 7.2 Gbps 512GB DDR5 Module","authors":"Sung Joo Park, Jonghoon J. Kim, Kun Joo, Young-Ho Lee, Kyoungsun Kim, Young-Tae Kim, W. Na, I. Choi, Hye-Seung Yu, W. Kim, J. Jung, Jaejun Lee, Dohyung Kim, Young-Uk Chang, G. Han, Hangi-Jung, Sunwon Kang, Jeonghyeon Cho, H. Song, T. Oh, Y. Sohn, Sang-Wook Hwang, Jooyoung Lee","doi":"10.1109/HCS52781.2021.9567190","DOIUrl":"https://doi.org/10.1109/HCS52781.2021.9567190","url":null,"abstract":"Spurred by the increasing market needs for big data and cloud services, global server suppliers and hyperscalers are looking to adopt high-speed and large-capacity memory modules. To meet this demand, the brand-new low-voltage operable DDR5 (double data rate 5th generation) memory can be an appropriate solution, with the highest speed of 7.2 Gbps and the largest capacity of 512 GB. However, some critical obstacles, such as increased capacity and high-speed I/O requirements, unstable power noise occurrences, high power consumption, and increase in operating temperature, must be overcome. This poster will cover various technical pathfinding solutions for the world's first DDR5 512 GB module with an advanced DRAM process and I/O schemes, package technology, and module architecture regarding improvements in the following four aspects: performance, speed, capacity, and power. This will unveil the industry's first high-performance and large-capacity memory product with 8-stacked DDR5 DRAMs. Samsung believes that this product will pave the way for achieving both higher bandwidth and lower power consumption to inaugurate the era of terabyte DRAM modules for next-gen servers.","PeriodicalId":246531,"journal":{"name":"2021 IEEE Hot Chips 33 Symposium (HCS)","volume":"129 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132272938","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SOT-MRAM – Third generation MRAM memory opens new opportunities : Hot Chips Conference August 2021","authors":"Barry A. Hoberman, J. Nozieres","doi":"10.1109/HCS52781.2021.9567072","DOIUrl":"https://doi.org/10.1109/HCS52781.2021.9567072","url":null,"abstract":"SOT is a straightforward extension of today’s ‘in production’ MRAM technologies running in major foundries. First memory technology to genuinely have the capability to converge both SRAM and NVM characteristics in advanced CMOS nodes (≤ 28 nm). Power, cost, and performance benefits of SOT are compelling. SOT is still in development, but look for market visibility to begin around 2024.","PeriodicalId":246531,"journal":{"name":"2021 IEEE Hot Chips 33 Symposium (HCS)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133619086","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"RDNA™ 2 Gaming Architecture","authors":"Andrew Pomianowski","doi":"10.1109/HCS52781.2021.9567555","DOIUrl":"https://doi.org/10.1109/HCS52781.2021.9567555","url":null,"abstract":"This presentation contains forward-looking statements concerning Advanced Micro Devices, Inc. (AMD) including, but not limited to, the features, functionality, availability, timing, expectations and expected benefits of AMD future products, including Ryzen™ 5000 Series CPUs and Socket AM4, which are made pursuant to the Safe Harbor provisions of the Private Securities Litigation Reform Act of 1995. Forward-looking statements are commonly identified by words such as “would,” “may,” “expects,” “believes,” “plans,” “intends,” “projects” and other terms with similar meaning. Investors are cautioned that the forward-looking statements in this presentation are based on current beliefs, assumptions and expectations, speak only as of the date of this presentation and involve risks and uncertainties that could cause actual results to differ materially from current expectations. Such statements are subject to certain known and unknown risks and uncertainties, many of which are difficult to predict and generally beyond AMD’s control, that could cause actual results and other future events to differ materially from those expressed in, or implied or projected by, the forward-looking information and statements. Investors are urged to review in detail the risks and uncertainties in AMD’s Securities and Exchange Commission filings, including but not limited to AMD’s Quarterly Report on Form 10-Q for the quarter ending on June 26, 2021.","PeriodicalId":246531,"journal":{"name":"2021 IEEE Hot Chips 33 Symposium (HCS)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116780849","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Accelerating ML Recommendation with over a Thousand RISC-V/Tensor Processors on Esperanto’s ET-SoC-1 Chip","authors":"D. Ditzel, R. Espasa, Nivard Aymerich, Allen Baum, Tom Berg, Jim Burr, Eric Hao, Jayesh Iyer, Miquel Izquierdo, Shankar Jayaratnam, Darren Jones, Chris Klingner, Jin Kim, Stephen Lee, Marc Lupon, G. Magklis, Bojan Maric, Rajib Nath, Michael Neilly, Duane J. Northcutt, Bill Orner, Jose Renau, Gerard Reves, X. Revés, Tom Riordan, Pedro Sanchez, S. Samudrala, Guillem Sole, Raymond Tang, Tommy Thorn, Francisco Torres, S. Tortella, Daniel Yau","doi":"10.1109/HCS52781.2021.9566904","DOIUrl":"https://doi.org/10.1109/HCS52781.2021.9566904","url":null,"abstract":"The ET-SoC-1 has over a thousand RISC-V processors on a single TSMC 7nm chip, including: • 1088 energy-efficient ET-Minion 64-bit RISC-V in-order cores, each with a vector/tensor unit • 4 high-performance ET-Maxion 64-bit RISC-V out-of-order cores • >160 million bytes of on-chip SRAM • Interfaces for large external memory with low-power LPDDR4x DRAM and eMMC FLASH • PCIe x8 Gen4 and other common I/O interfaces • Innovative low-power architecture and circuit techniques that allow the entire chip to: • Compute at peak rates of 100 to 200 TOPS • Operate using under 20 watts for ML recommendation workloads","PeriodicalId":246531,"journal":{"name":"2021 IEEE Hot Chips 33 Symposium (HCS)","volume":"117 5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128376641","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Aquabolt-XL: Samsung HBM2-PIM with in-memory processing for ML accelerators and beyond","authors":"J. Kim, Shinhaeng Kang, Sukhan Lee, Hyeonsu Kim, Woongjae Song, Yuhwan Ro, Seungwon Lee, David Wang, Hyunsung Shin, BengSeng Phuah, Jihyun Choi, J. So, Yeon-Gon Cho, Joonho Song, J. Choi, Jeonghyeon Cho, Kyomin Sohn, Y. Sohn, Kwang-il Park, N. Kim","doi":"10.1109/HCS52781.2021.9567191","DOIUrl":"https://doi.org/10.1109/HCS52781.2021.9567191","url":null,"abstract":"Using PIM to overcome the memory bottleneck • Although various methods of increasing bandwidth have been proposed, it is physically impossible to achieve a breakthrough increase: bandwidth is limited by the number of PCB wires, the number of CPU balls, and thermal constraints. • PIM has been proposed to improve the performance of bandwidth-intensive workloads and to improve energy efficiency by reducing computing-memory data movement.","PeriodicalId":246531,"journal":{"name":"2021 IEEE Hot Chips 33 Symposium (HCS)","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134201114","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dynamic Neural Accelerator for Reconfigurable & Energy-efficient Neural Network Inference","authors":"Nikolay Nez, Antonio N. Vilchez, H. Zohouri, Oleg Khavin, Sakyasingha Dasgupta","doi":"10.1109/HCS52781.2021.9566886","DOIUrl":"https://doi.org/10.1109/HCS52781.2021.9566886","url":null,"abstract":"Unique Challenges for AI Inference Hardware at the Edge • Peak TOPS or TOPS/Watt are not ideal measures of performance at the edge. Cannot prioritize performance over power efficiency (throughput/watt) • Much AI hardware relies on batching to improve utilization, which is unsuitable for the streaming-data (batch size 1) use case at the edge • AI hardware architectures that fully cache network parameters using large on-chip SRAM cannot be scaled down easily to sizes applicable for edge workloads • Need adaptability to new workloads and the ability to deploy multiple AI models • AI-specific accelerators need to operate within heterogeneous compute environments • Need for efficient compiler & scheduling to maximize compute utilization • Need for high software robustness and usability","PeriodicalId":246531,"journal":{"name":"2021 IEEE Hot Chips 33 Symposium (HCS)","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132753106","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Intel’s Hyperscale-Ready Infrastructure Processing Unit (IPU)","authors":"Bradley Burres, Dan Daly, M. Debbage, Eliel Louzoun, Christine Severns-Williams, Naru Sundar, Nadav Turbovich, Barry Wolford, Yadong Li","doi":"10.1109/HCS52781.2021.9567455","DOIUrl":"https://doi.org/10.1109/HCS52781.2021.9567455","url":null,"abstract":"Major advantages of IPUs: • Separation of infrastructure & tenant: the guest can fully control the CPU with their SW, while the CSP maintains control of the infrastructure and root of trust. • Infrastructure offload: accelerators help process these tasks efficiently, minimizing latency and jitter and maximizing revenue from the CPU. • Diskless server architecture: simplifies data center architecture while adding flexibility for the CSP.","PeriodicalId":246531,"journal":{"name":"2021 IEEE Hot Chips 33 Symposium (HCS)","volume":"63 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114708692","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sapphire Rapids","authors":"Arijit Biswas","doi":"10.1109/HCS52781.2021.9566865","DOIUrl":"https://doi.org/10.1109/HCS52781.2021.9566865","url":null,"abstract":"Next-Gen Intel Xeon Scalable Processor • New Standard for Data Center Architecture • Designed for Microservices & AI Workloads • Pioneering Advanced Memory & IO Transitions","PeriodicalId":246531,"journal":{"name":"2021 IEEE Hot Chips 33 Symposium (HCS)","volume":"204 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114754409","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Skydio Autonomy Engine: Enabling The Next Generation Of Autonomous Flight","authors":"A. Bachrach","doi":"10.1109/HCS52781.2021.9567400","DOIUrl":"https://doi.org/10.1109/HCS52781.2021.9567400","url":null,"abstract":"Drones hold the promise of massive positive impact. Existing use cases are easier and more reliable. New use cases that were previously impossible are enabled.","PeriodicalId":246531,"journal":{"name":"2021 IEEE Hot Chips 33 Symposium (HCS)","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121162119","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}