A 58.6mW real-time programmable object detector with multi-scale multi-object support using deformable parts model on 1920×1080 video at 30fps

Amr Suleiman, Zhengdong Zhang, V. Sze
{"title":"A 58.6mW real-time programmable object detector with multi-scale multi-object support using deformable parts model on 1920×1080 video at 30fps","authors":"Amr Suleiman, Zhengdong Zhang, V. Sze","doi":"10.1109/VLSIC.2016.7573528","DOIUrl":null,"url":null,"abstract":"This paper presents a programmable, energy-efficient and real-time object detection accelerator using deformable parts models (DPM), with 2× higher accuracy than traditional rigid body models. With 8 deformable parts detection, three methods are used to address the high computational complexity: classification pruning for 33× fewer parts classification, vector quantization for 15× memory size reduction, and feature basis projection for 2× reduction of the cost of each classification. The chip is implemented in 65nm CMOS technology, and can process HD (1920×1080) images at 30fps without any off-chip storage while consuming only 58.6mW (0.94nJ/pixel, 1168 GOPS/W). The chip has two classification engines to simultaneously detect two different classes of objects. With a tested high throughput of 60fps, the classification engines can be time multiplexed to detect even more than two object classes. It is energy scalable by changing the pruning factor or disabling the parts classification.","PeriodicalId":6512,"journal":{"name":"2016 IEEE Symposium on VLSI Circuits (VLSI-Circuits)","volume":"5 1","pages":"1-2"},"PeriodicalIF":0.0000,"publicationDate":"2016-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"19","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 IEEE Symposium on VLSI Circuits (VLSI-Circuits)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/VLSIC.2016.7573528","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 19

Abstract

This paper presents a programmable, energy-efficient and real-time object detection accelerator using deformable parts models (DPM), with 2× higher accuracy than traditional rigid body models. With 8 deformable parts detection, three methods are used to address the high computational complexity: classification pruning for 33× fewer parts classification, vector quantization for 15× memory size reduction, and feature basis projection for 2× reduction of the cost of each classification. The chip is implemented in 65nm CMOS technology, and can process HD (1920×1080) images at 30fps without any off-chip storage while consuming only 58.6mW (0.94nJ/pixel, 1168 GOPS/W). The chip has two classification engines to simultaneously detect two different classes of objects. With a tested high throughput of 60fps, the classification engines can be time multiplexed to detect even more than two object classes. It is energy scalable by changing the pruning factor or disabling the parts classification.
一个58.6mW的实时可编程目标探测器,支持多尺度多目标,使用可变形部件模型,在1920×1080视频中以30fps的速度运行
本文提出了一种基于可变形零件模型(DPM)的可编程、节能、实时目标检测加速器,其精度比传统刚体模型提高2倍。在8个可变形零件检测的情况下,采用三种方法解决计算复杂度高的问题:分类剪枝方法减少33倍的零件分类,矢量量化方法减少15倍的内存大小,特征基投影方法减少2倍的分类成本。该芯片采用65nm CMOS技术,无需任何片外存储即可以30fps的速度处理高清(1920×1080)图像,功耗仅为58.6mW (0.94nJ/pixel, 1168 GOPS/W)。该芯片有两个分类引擎,可以同时检测两种不同类别的物体。经过测试的高吞吐量为60fps,分类引擎可以进行时间复用,以检测两个以上的对象类别。它可以通过改变修剪因子或禁用部件分类来扩展能量。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信