Raghid Morcel, Haitham Akkary, Hazem M. Hajj, M. Saghir, A. Keshavamurthy, R. Khanna, H. Artail
2017 IEEE 25th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), April 2017. DOI: 10.1109/FCCM.2017.62
Minimalist Design for Accelerating Convolutional Neural Networks for Low-End FPGA Platforms
Deep neural networks have gained tremendous attention in both the academic and industrial communities due to their performance in many artificial intelligence applications, particularly in computer vision. However, these algorithms are computationally very demanding for both scoring (inference) and model learning (training). State-of-the-art recognition models use tens of millions of parameters and have significant memory and computational requirements. These requirements have restricted deep neural network applications to high-end, expensive, and power-hungry platforms, limiting the penetration of deep learning into IoT markets. This paper presents work at the leading-edge intersection of several evolving technologies, including emerging IoT platforms, deep learning, and field-programmable gate array (FPGA) computing. We demonstrate a new minimalist design methodology that minimizes the utilization of FPGA resources and can run deep learning algorithms with over 60 million parameters. This makes it particularly suitable for resource-constrained, low-end FPGA platforms.