{"title":"A Heuristic Exploration of Retraining-free Weight-Sharing for CNN Compression","authors":"Etienne Dupuis, D. Novo, Ian O’Connor, A. Bosio","doi":"10.1109/ASP-DAC52403.2022.9712487","DOIUrl":"https://doi.org/10.1109/ASP-DAC52403.2022.9712487","url":null,"abstract":"The computational workload involved in Convolutional Neural Networks (CNNs) is typically out of reach for low-power embedded devices. The scientific literature provides a large number of approximation techniques to address this problem. Among them, the Weight-Sharing (WS) technique gives promising results, but it requires carefully determining the shared values for each layer of a given CNN. As the number of possible solutions grows exponentially with the number of layers, the WS Design Space Exploration (DSE) time can easily explode for state-of-the-art CNNs. In this paper, we propose a new heuristic approach to drastically reduce the exploration time without sacrificing the quality of the output. The results carried out on recent CNNs (GoogleNet [1], ResNet50V2 [2], MobileNetV2 [3], InceptionV3 [4], and EfficientNet [5]), trained with the ImageNet [6] dataset, show over 5× memory compression at an acceptable accuracy loss (complying with the MLPerf [7] quality target) without any retraining step and in less than 10 hours. Our code is publicly available on GitHub [8].","PeriodicalId":239260,"journal":{"name":"2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114143550","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Lamina: Low Overhead Wear Leveling for NVM with Bounded Tail","authors":"Jiacheng Huang, Min Peng, Libing Wu, C. Xue, Qingan Li","doi":"10.1109/asp-dac52403.2022.9712599","DOIUrl":"https://doi.org/10.1109/asp-dac52403.2022.9712599","url":null,"abstract":"Emerging non-volatile memory (NVM) has been considered as a promising candidate for the next generation memory architecture because of its excellent characteristics. However, the endurance of NVM is much lower than DRAM. Without additional wear management technology, its lifetime can be very short, which extremely limits the use of NVM. This paper observes that the tail wear with a very small percentage of extreme deviation significantly hurts the lifetime of NVM, which the existing methods do not effectively solve. We present Lamina to address the tail wear issue, in order to improve the lifetime of NVM. Lamina consists of two parts: bounded tail wear leveling (BTWL) and lightweight wear enhancement (LWE). BTWL is used to make the wear degree of all pages close to the average value and control the upper limit of tail wear. LWE improves the accuracy of BTWL by exploiting the locality to interpolate low-frequency sampling schemes in virtual memory space. Our experiments show that compared with the state-of-the-art methods, Lamina can significantly improve the lifetime of NVM with low overhead.","PeriodicalId":239260,"journal":{"name":"2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"68 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115015869","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FeMIC: Multi-Operands in-Memory Computing Based on FeFETs","authors":"R. Liu, Xiaoyu Zhang, Xiaoming Chen, Yinhe Han, M. Tang","doi":"10.1109/ASP-DAC52403.2022.9712498","DOIUrl":"https://doi.org/10.1109/ASP-DAC52403.2022.9712498","url":null,"abstract":"The “memory wall” bottleneck caused by the performance gap between processors and memories is getting worse. Computing-in-memory (CiM), a promising technology to alleviate the “memory wall” bottleneck, has recently attracted much attention. Conventional CiM architectures based on emerging nonvolatile devices have a major drawback that they need $N-1$ clock cycles to complete a CiM operation with $N$ operands, as they are natively designed for processing two operands. In this work, we propose FeMIC, a new CiM architecture based on ferroelectric field-effect transistors (FeFETs), which natively supports the computation of multiple operands. For a CiM operation with $N$ operands, FeMIC only needs $\\left\\lfloor N/2 \\right\\rfloor$ clock cycles. The simulation results based on a calibrated FeFET model reveal that FeMIC can significantly reduce the energy consumption when processing multi-operand CiM operations, compared with state-of-the-art designs that use conventional CiM mechanisms.","PeriodicalId":239260,"journal":{"name":"2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123094387","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"EasyMAC: Design Exploration-Enabled Multiplier-Accumulator Generator Using a Canonical Architectural Representation: (Invited Paper)","authors":"Jiaxi Zhang, Qiuyan Gao, Yijiang Guo, Bizhao Shi, Guojie Luo","doi":"10.1109/ASP-DAC52403.2022.9712519","DOIUrl":"https://doi.org/10.1109/ASP-DAC52403.2022.9712519","url":null,"abstract":"Multiplier-accumulator (MAC) is a crucial arithmetic element widely used in digital integrated circuits. Customized MACs are necessary for different scenarios but need great effort due to the huge architecture design space. In this paper, we develop EasyMAC, a flexible Chisel-based MAC generator with a canonical architectural representation. We design a compact and canonical sequence representation to express the architecture of MACs. And the MAC generator takes the compact representation as input to gain the Verilog codes. We also give a case study on developing a heuristic design space exploration (DSE) method based on this representation. The experimental result shows the effectiveness of the representation in DSE. Using the percent relative range of the power-delay-area product as a metric to measure the optimization opportunities that this representation exposes, the relative range is 17.4% and 23.1% for 16×16 and 25×18 MACs, respectively. At last, we discuss some promising directions of EasyMAC.","PeriodicalId":239260,"journal":{"name":"2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130002927","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Common-Centroid Layout for Active and Passive Devices: A Review and the Road Ahead","authors":"Nibedita Karmokar, Meghna Madhusudan, A. Sharma, R. Harjani, Mark Po-Hung Lin, S. Sapatnekar","doi":"10.1109/ASP-DAC52403.2022.9712576","DOIUrl":"https://doi.org/10.1109/ASP-DAC52403.2022.9712576","url":null,"abstract":"This paper presents an overview of common-centroid (CC) layout styles, used in analog designs to overcome the impact of systematic variations. CC layouts must be carefully engineered to minimize the impact of mismatch. Algorithms for CC layout must be aware of routing parasitics, layout-dependent effects (for active devices), and the performance impact of layout choices. The optimal CC layout further depends on factors such as the choice of the unit device and the relative impact of uncorrelated and systematic variations. The paper also examines scenarios where non-CC layouts may be preferable to CC layouts.","PeriodicalId":239260,"journal":{"name":"2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130039706","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CacheGuard: A Behavior Model Checker for Cache Timing Side-Channel Security: (Invited Paper)","authors":"Zi-Han Xu, Lingfeng Yin, Yongqiang Lyu, Haixia Wang, Gang Qu, Dongsheng Wang","doi":"10.1109/ASP-DAC52403.2022.9712560","DOIUrl":"https://doi.org/10.1109/ASP-DAC52403.2022.9712560","url":null,"abstract":"Defending cache timing side-channels has become a major concern in modern secure processor designs. However, a formal method that can completely check if a given cache design can defend against timing side-channel attacks is still absent. This study presents CacheGuard, a behavior model checker for cache timing side-channel security. Compared to current state-of-the-art prose rule-based security analysis methods, CacheGuard covers the whole state space for a given cache design to discover unknown side-channel attacks. Checking results on standard cache and state-of-the-art secure cache designs discovers 5 new attack strategies, and potentially makes it possible to develop a timing side channel-safe cache with the aid of CacheGuard.","PeriodicalId":239260,"journal":{"name":"2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124498018","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A 2.17μW@120fps Ultra-Low-Power Dual-Mode CMOS Image Sensor with Senputing Architecture","authors":"Ziwei Li, Han Xu, Zheyu Liu, Li Luo, Qi Wei, Fei Qiao","doi":"10.1109/ASP-DAC52403.2022.9712591","DOIUrl":"https://doi.org/10.1109/ASP-DAC52403.2022.9712591","url":null,"abstract":"This paper proposes an ultra-low-power CMOS Image Sensor (CIS) chip based on a sensing-with-computing (Senputing) architecture to reduce the power bottleneck of vision systems. This Senputing chip achieves BNN 1st-layer convolution in the analog domain with ultra-low power consumption. It has two working modes: Normal-Sensor (NS) mode and Direct-Photocurrent-Computation (DPC) mode. The prototype measurement results in a 65nm CMOS process on the MNIST classification task show that the power of feature map computation is 2.17μW at a 120fps frame rate with 98.1% accuracy. The computation efficiency reaches 11.49TOPs/W, which is 14.8× higher than state-of-the-art works.","PeriodicalId":239260,"journal":{"name":"2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"206 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116192191","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"AdaSens: Adaptive Environment Monitoring by Coordinating Intermittently-Powered Sensors","authors":"Shuyue Lan, Zhilu Wang, John Mamish, Josiah D. Hester, Qi Zhu","doi":"10.1109/ASP-DAC52403.2022.9712501","DOIUrl":"https://doi.org/10.1109/ASP-DAC52403.2022.9712501","url":null,"abstract":"Perceiving the environment for better and more efficient situational awareness is essential in applications such as wildlife surveillance, wildfire detection, crop irrigation, and building management. Energy-harvesting, intermittently-powered sensors have emerged as a zero maintenance solution for long-term environmental perception. However, these devices suffer from intermittent and varying energy supply, which presents three major challenges for executing perceptual tasks: (1) intelligently scaling computation in light of constrained resources and dynamic energy availability, (2) planning communication and sensing tasks, and (3) coordinating sensor nodes to increase the total perceptual range of the network. We propose an adaptive framework, AdaSens, which adapts the operations of intermittently-powered sensor nodes in a coordinated manner to cover as much as possible of the targeted scene, both spatially and temporally, under interruptions and constrained resources. We evaluate AdaSens on a real-world surveillance video dataset, VideoWeb, and show at least 16% improvement on the coverage of the important frames compared with other methods.","PeriodicalId":239260,"journal":{"name":"2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116533270","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Net Separation-Oriented Printed Circuit Board Placement via Margin Maximization","authors":"Chung-Kuan Cheng, Chia-Tung Ho, Chester Holtz","doi":"10.1109/ASP-DAC52403.2022.9712480","DOIUrl":"https://doi.org/10.1109/ASP-DAC52403.2022.9712480","url":null,"abstract":"Packaging has become a crucial process due to the paradigm shift of More than Moore. Addressing manufacturing and yield issues is a significant challenge for modern layout algorithms. We propose to use printed circuit board (PCB) placement as a benchmark for the packaging problem. A maximum-margin formulation is devised to improve the separation between nets. Our framework includes seed layout proposals, a coordinate descent-based procedure to optimize routability, and a mixed-integer linear programming method to legalize the layout. We perform an extensive study with 14 PCB designs and an open-source router. We show that the placements produced by NS-place improve routed wirelength by up to 25%, reduce the number of vias by up to 50%, and reduce the number of DRVs by 79% compared to manual and wirelength-minimal placements.","PeriodicalId":239260,"journal":{"name":"2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126360314","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Side-Channel Hardware Trojan in 65nm CMOS with $2\\mu\\mathrm{W}$ precision and Multi-bit Leakage Capability","authors":"T. Perez, S. Pagliarini","doi":"10.1109/ASP-DAC52403.2022.9712490","DOIUrl":"https://doi.org/10.1109/ASP-DAC52403.2022.9712490","url":null,"abstract":"In this work, a novel architecture for a side-channel trojan (SCT) capable of leaking multiple bits per power signature reading is proposed. This trojan is inserted utilizing a novel framework featuring an Engineering Change Order (ECO) flow. For assessing our methodology, a testchip comprising two versions of the AES and two of the Present (PST) crypto cores is manufactured in a 65nm commercial technology. Our results from the hardware validation demonstrate that keys are successfully leaked by creating microwatt-sized shifts in the power consumption.","PeriodicalId":239260,"journal":{"name":"2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"469 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125837816","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}