Mask-Net: A Hardware-efficient Object Detection Network with Masked Region Proposals

Han-Chen Chen, Cong Hao
2022 IEEE 33rd International Conference on Application-specific Systems, Architectures and Processors (ASAP), published 2022-07-01
DOI: 10.1109/ASAP54787.2022.00030 (https://doi.org/10.1109/ASAP54787.2022.00030)
Citations: 1

Abstract

Object detection on embedded systems is challenging because real-time inference is hard to achieve with low energy consumption and limited hardware resources. A further challenge is finding hardware-friendly methods that avoid redundant computation. To address both, we propose Mask-Net, a hardware-efficient object detection network with masked region proposals in regular shapes. First, we propose a hardware-friendly region proposal method that avoids redundant computation as much as possible and as early as possible, with slight or no accuracy loss. Second, we demonstrate that the method generalizes by applying it to several detection backbones, including SkyNet, ResNet-18, and UltraNet. It performs well across different scenarios, including the DAC-SDC, UAV123, and OTB100 datasets. We choose SkyNet as the base model for an accelerator design and verify that design on a Xilinx ZCU106 FPGA. With the FPGA running at frequencies from 124 MHz to 214 MHz, we observe a 1.3× speedup and about a 30% reduction in energy consumption, with only a slight accuracy loss. We also conduct a design space exploration and show that the accelerator can achieve a theoretical speedup of 1.76× with masked region proposals, by optimally allocating DSPs to different parts of the accelerator to balance the computation before and after the mask.
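The idea of balancing computation before and after the mask can be illustrated with a toy model. The sketch below is not the paper's design space exploration: it assumes a two-stage pipeline whose throughput is limited by the slower stage, assumes each DSP performs one operation per cycle, and uses a hypothetical `keep_ratio` for the fraction of post-mask work that remains after masking. The function name `best_split` and all numbers are made up for illustration.

```python
def best_split(ops_pre, ops_post, keep_ratio, total_dsps):
    """Toy model: split DSPs between the pre-mask and post-mask stages.

    Assumptions (not from the paper): stage latency is ops / dsps, the
    pipeline is limited by the slower stage, and masking scales the
    post-mask workload by keep_ratio.
    Returns (dsps_pre, dsps_post, bottleneck_cycles).
    """
    best = None
    # Exhaustively try every integer split; each stage gets at least one DSP.
    for dsps_pre in range(1, total_dsps):
        dsps_post = total_dsps - dsps_pre
        cycles_pre = ops_pre / dsps_pre
        cycles_post = ops_post * keep_ratio / dsps_post
        bottleneck = max(cycles_pre, cycles_post)
        if best is None or bottleneck < best[2]:
            best = (dsps_pre, dsps_post, bottleneck)
    return best


# Hypothetical workload: 100 ops before the mask, 300 ops after it, 40 DSPs.
unmasked = best_split(100, 300, keep_ratio=1.0, total_dsps=40)
masked = best_split(100, 300, keep_ratio=0.5, total_dsps=40)
toy_speedup = unmasked[2] / masked[2]
```

In this made-up setting, masking away half of the post-mask work shifts the best split from 10/30 DSPs to 16/24, which is the kind of rebalancing the abstract describes. The resulting toy speedup is a property of the invented numbers only and says nothing about the 1.3× or 1.76× figures reported for the real accelerator.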