{"title":"EDAML 2022 Invited Speaker 2: AI Algorithm and Accelerator Co-design for Computing on the Edge","authors":"Deming Chen","doi":"10.1109/IPDPSW55747.2022.00195","DOIUrl":null,"url":null,"abstract":"In a conventional top-down design flow, deep-learning algorithms are first designed concentrating on the model accuracy, and then accelerated through hardware accelerators trying to meet various system design targets on power, energy, speed, and cost. However, this approach often does not work well because it ignores the physical constraints that the hardware architectures themselves would have towards the deep neural network (DNN) algorithm design and deployment, especially for the DNNs that will be deployed unto edge devices. Thus, an ideal scenario is that algorithms and their hardware accelerators are developed simultaneously. In this talk, we will present our DNN/Accelerator co-design and co-search methods. Our results have shown great promises for delivering high-performance hardware-tailored DNNs and DNNtailored accelerators naturally and elegantly. 
One of the DNN models coming out of this co-design method, called SkyNet, won a double championship in the competitive DAC System Design Contest for both the GPU and the FPGA tracks for low-power object detection.","PeriodicalId":286968,"journal":{"name":"2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)","volume":"48 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IPDPSW55747.2022.00195","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
In a conventional top-down design flow, deep-learning algorithms are first designed with a focus on model accuracy and then mapped onto hardware accelerators in an attempt to meet various system design targets on power, energy, speed, and cost. However, this approach often works poorly because it ignores the physical constraints that the hardware architectures themselves impose on deep neural network (DNN) algorithm design and deployment, especially for DNNs deployed onto edge devices. Ideally, algorithms and their hardware accelerators should therefore be developed simultaneously. In this talk, we will present our DNN/accelerator co-design and co-search methods. Our results show great promise for delivering high-performance hardware-tailored DNNs and DNN-tailored accelerators naturally and elegantly. One of the DNN models produced by this co-design method, SkyNet, won a double championship in the competitive DAC System Design Contest, taking both the GPU and the FPGA tracks for low-power object detection.
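To make the co-search idea concrete, the sketch below shows a toy hardware-aware search loop: DNN parameters (depth, width) and an accelerator parameter (parallelism) are sampled jointly, infeasible candidates are pruned by a latency budget, and the best accuracy proxy wins. All functions, parameter names, and numbers here are illustrative assumptions for exposition, not the method or values from the talk.

```python
import random

# Toy accuracy proxy (assumption): deeper/wider models score higher,
# with diminishing returns. Not a real accuracy model.
def estimate_accuracy(depth, width):
    return 1.0 - 1.0 / (1.0 + 0.1 * depth * width)

# Toy latency proxy (assumption): compute grows with model size and
# shrinks with hardware parallelism.
def estimate_latency_ms(depth, width, parallelism):
    return depth * width / parallelism

def co_search(latency_budget_ms, trials=200, seed=0):
    """Jointly sample DNN and accelerator parameters under a latency budget."""
    rng = random.Random(seed)
    best = None
    for _ in range(trials):
        depth = rng.randint(4, 32)
        width = rng.choice([16, 32, 64, 128])
        parallelism = rng.choice([2, 4, 8, 16])
        lat = estimate_latency_ms(depth, width, parallelism)
        if lat > latency_budget_ms:
            continue  # hardware constraint prunes the candidate early
        acc = estimate_accuracy(depth, width)
        if best is None or acc > best["accuracy"]:
            best = {"accuracy": acc, "depth": depth, "width": width,
                    "parallelism": parallelism, "latency_ms": lat}
    return best
```

The point of the loop is that the latency check runs before any accuracy evaluation, so hardware constraints shape the DNN search space instead of being bolted on after the model is fixed.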