Kazutoshi Hirose, Jaehoon Yu, Kota Ando, Yasuyuki Okoshi, Ángel López García-Arias, Jun Suzuki, Thiem Van Chu, Kazushi Kawamura, Masato Motomura
2022 IEEE International Solid-State Circuits Conference (ISSCC), 2022-02-20, pp. 1-3
DOI: 10.1109/ISSCC42614.2022.9731668
Hiddenite: 4K-PE Hidden Network Inference 4D-Tensor Engine Exploiting On-Chip Model Construction Achieving 34.8-to-16.0TOPS/W for CIFAR-100 and ImageNet
Since the advent of the Lottery Ticket Hypothesis [1], which posits the existence of embedded sparse models that achieve accuracy equivalent to the original dense model, new algorithms for finding such subnetworks have attracted attention. In particular, the Hidden Network (HNN) [2] introduced a training method that finds such accurate subnetworks (Fig. 15.4.1). HNN extracts the sparse subnetwork by taking a logical AND of an initial model's random weights and a binary mask, called a supermask, that defines the selected connections. The importance of each connection, quantified as a score, is evaluated during training; the supermask is learned by keeping the connections with the top-k% highest scores. Although similar to pruning, supermask training differs fundamentally in that it never updates the initial random weights.
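The supermask idea described above can be sketched in a few lines: learn a score per connection, keep the top-k% highest-scoring connections as a binary mask, and apply it to the frozen random weights. The following is a minimal NumPy illustration under our own naming, not the paper's implementation (score training itself, done with a straight-through estimator in HNN, is omitted):

```python
import numpy as np

def supermask(scores: np.ndarray, k_percent: float) -> np.ndarray:
    """Binary mask keeping the top-k% highest-scoring connections."""
    n_keep = int(round(scores.size * k_percent / 100.0))
    flat = scores.ravel()
    # Indices of the n_keep largest scores.
    top_idx = np.argsort(flat)[::-1][:n_keep]
    mask = np.zeros(flat.size, dtype=bool)
    mask[top_idx] = True
    return mask.reshape(scores.shape)

rng = np.random.default_rng(0)
weights = rng.standard_normal((4, 4))  # frozen random weights (never updated)
scores = rng.random((4, 4))            # learned importance scores (trained instead)

mask = supermask(scores, 25.0)                 # keep top 25% of connections
sparse_weights = np.where(mask, weights, 0.0)  # "logical AND" of weights and mask
```

With a 4x4 layer and k = 25%, exactly 4 of the 16 random weights survive; training adjusts only `scores`, so the same frozen `weights` can host different subnetworks as the mask changes.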