Artificial Neural Network and Accelerator Co-design using Evolutionary Algorithms
Philip Colangelo, Oren Segal, Alexander Speicher, M. Margala
2019 IEEE High Performance Extreme Computing Conference (HPEC), September 2019. DOI: 10.1109/HPEC.2019.8916533
Multilayer feed-forward Artificial Neural Networks (ANNs) are universal function approximators capable of modeling measurable functions to any desired degree of accuracy. In practice, designing efficient neural network architectures requires significant effort and expertise. Designing architectures that also map optimally onto hardware for the benefit of acceleration adds yet another degree of complexity. In this paper, we use Evolutionary Cell Aided Design (ECAD), a framework that searches the design spaces of both ANN structures and reconfigurable hardware to find solutions satisfying a set of constraints and fitness functions. A modular, scalable machine learning accelerator based on a 2D systolic array, built for an Arria 10 GX 1150 FPGA device using OpenCL, allows results to be tested and deployed on real hardware. Alongside the hardware, a software model of the architecture was developed to speed up the evolutionary process. We present results from the ECAD framework showing how optimizing for different objectives, including accuracy, images per second, effective giga-operations per second, and latency, affects both the ANN and the hardware configuration. Through this work we show that a distinct solution can deliver the best performance for each objective. This work lays the foundation for finding machine-learning-based solutions for a wide range of applications with different system constraints.
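To illustrate the kind of co-design search the abstract describes, the following is a minimal Python sketch of an evolutionary loop that jointly mutates ANN hyperparameters and systolic-array dimensions and scores each candidate with a toy software performance model. The genome fields, PE budget, and fitness formula are illustrative assumptions for this sketch only; they are not ECAD's actual encoding, operators, or hardware model.

```python
import random

# Hypothetical genome: a few ANN hyperparameters plus two systolic-array
# dimensions (rows x columns of processing elements). Illustrative only;
# the ECAD framework's real encoding is richer than this.
def random_genome():
    return {
        "hidden_layers": random.randint(1, 4),
        "neurons_per_layer": random.choice([64, 128, 256, 512]),
        "pe_rows": random.choice([4, 8, 16, 32]),
        "pe_cols": random.choice([4, 8, 16, 32]),
    }

def mutate(genome):
    # Re-sample one randomly chosen field of the genome.
    child = dict(genome)
    key = random.choice(list(child))
    child[key] = random_genome()[key]
    return child

# Toy software model standing in for the paper's architecture model:
# reward raw parallelism (PE count) but reject designs that exceed an
# assumed device budget. A real fitness function would combine accuracy,
# images/s, effective GOPS, and latency for each configuration.
PE_BUDGET = 512  # assumed resource constraint, not an Arria 10 figure

def fitness(genome):
    pes = genome["pe_rows"] * genome["pe_cols"]
    if pes > PE_BUDGET:                 # constraint: design must fit the device
        return 0.0
    work = genome["hidden_layers"] * genome["neurons_per_layer"]
    return pes / (1.0 + work / pes)     # crude images-per-second proxy

def evolve(pop_size=20, generations=50):
    population = [random_genome() for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]              # truncation selection
        children = [mutate(random.choice(parents)) for _ in parents]
        population = parents + children
    return max(population, key=fitness)

if __name__ == "__main__":
    best = evolve()
    print("best configuration:", best, "fitness:", round(fitness(best), 2))
```

Swapping the fitness function (for example, penalizing latency instead of rewarding throughput) steers the search toward a different ANN/hardware pairing, which mirrors the paper's observation that each optimization objective can favor a distinct solution.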