Ken Namura, Johannes Maximilian Kühn, T. Adachi, H. Imachi, H. Kaneko, T. Kato, Go Watanabe, Naoto Tanaka, S. Kashihara, Hiroshi Miyashita, Y. Tomonaga, Ryosuke Okuta, Takuya Akiba, Brian K. Vogel, S. Kitajo, F. Osawa, K. Takahashi, Y. Takatsukasa, K. Mizumaru, T. Yamauchi, J. Ono, A. Takahashi, Tanvir Ahmed, Y. Doi, K. Hiraki, J. Makino
2021 Symposium on VLSI Circuits, published 2021-06-13. DOI: 10.23919/VLSICircuits52068.2021.9492395
MN-Core - A Highly Efficient and Scalable Approach to Deep Learning
MN-Core is a highly efficient deep learning training accelerator that exceeds 1 TFLOPS/W (half-precision) at board level in real-world mixed-precision workloads. To reach and sustain this level of performance, the design is partitioned and packaged as a four-die MCM package exceeding 3000 mm² of die area.