Proceedings of the 5th International Workshop on Embedded and Mobile Deep Learning: Latest Articles

Battery-Free Camera Occupancy Detection System
Ali Saffari, Sin Yong Tan, Mohamad Katanbaf, Homagni Saha, Joshua R. Smith, S. Sarkar
{"title":"Battery-Free Camera Occupancy Detection System","authors":"Ali Saffari, Sin Yong Tan, Mohamad Katanbaf, Homagni Saha, Joshua R. Smith, S. Sarkar","doi":"10.1145/3469116.3470013","DOIUrl":"https://doi.org/10.1145/3469116.3470013","url":null,"abstract":"Occupancy detection systems are commonly equipped with high-quality cameras and a processor with high computational power to run detection algorithms. This paper presents a human occupancy detection system that uses battery-free cameras and a deep learning model implemented on a low-cost hub to detect human presence. Our low-resolution camera harvests energy from ambient light and transmits data to the hub using backscatter communication. We implement the state-of-the-art YOLOv5 network detection algorithm that offers high detection accuracy and fast inferencing speed on a Raspberry Pi 4 Model B. We achieve an inferencing speed of ~ 100ms per image and an overall detection accuracy of >90% with only 2GB CPU RAM on the Raspberry Pi. In the experimental results, we also demonstrate that the detection is robust to noise, illuminance, occlusion, and angle of depression.","PeriodicalId":162801,"journal":{"name":"Proceedings of the 5th International Workshop on Embedded and Mobile Deep Learning","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126605070","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 12
Are Mobile DNN Accelerators Accelerating DNNs?
Qingqing Cao, Alexandru Eugen Irimiea, Mohamed Abdelfattah, A. Balasubramanian, N. Lane
{"title":"Are Mobile DNN Accelerators Accelerating DNNs?","authors":"Qingqing Cao, Alexandru Eugen Irimiea, Mohamed Abdelfattah, A. Balasubramanian, N. Lane","doi":"10.1145/3469116.3470011","DOIUrl":"https://doi.org/10.1145/3469116.3470011","url":null,"abstract":"Deep neural networks (DNNs) are running on many mobile and embedded devices with the goal of energy efficiency and highest possible performance. However, DNN workloads are getting more computationally intensive, and simultaneously their deployment is ever-increasing. This has led to the creation of many purpose-built low-power neural accelerators to replace or augment traditional mobile CPUs and GPUs. In this work, we provide an in-depth study of one set of commercially-available mobile accelerators, the Intel Neural Compute Sticks (NCS). We perform a systematic measurement study of the latency and energy of this accelerator under a variety of DNNs including convolutional neural networks (CNNs) for vision tasks and attention-based Transformer models for NLP tasks. We compare to the mobile processors (CPU, GPU, and DSP) on a smartphone and a mobile board. Our study shows commercial mobile accelerators like NCS are not ready yet to provide the performance as claimed. We also point out directions in optimizing the model architectures to better suit these accelerators.","PeriodicalId":162801,"journal":{"name":"Proceedings of the 5th International Workshop on Embedded and Mobile Deep Learning","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133192537","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
Towards Ubiquitous Learning: A First Measurement of On-Device Training Performance
Dongqi Cai, Qipeng Wang, Yuanqiang Liu, Yunxin Liu, Shangguang Wang, Mengwei Xu
{"title":"Towards Ubiquitous Learning: A First Measurement of On-Device Training Performance","authors":"Dongqi Cai, Qipeng Wang, Yuanqiang Liu, Yunxin Liu, Shangguang Wang, Mengwei Xu","doi":"10.1145/3469116.3470009","DOIUrl":"https://doi.org/10.1145/3469116.3470009","url":null,"abstract":"We are witnessing the emergence of ubiquitous learning, where each device (smartphones, wearables, IoTs, etc) can learn from their environments either alone or collaboratively. Such a new paradigm is enabled by deep learning techniques, or more specifically, on-device training. Given its popularity in the machine learning community, unfortunately, there are no systematic understandings of a critical question: how much cost does it take to train typical deep models on commodity end devices? Therefore, this work performs comprehensive measurements of on-device training with the state-of-the-art training library, 6 mobile phones, and 5 classical neural networks. Our measurements report metrics of training time, energy consumption, memory footprint, hardware utilization, and thermal dynamics, thus help reveal a complete landscape of the on-device training performance. The observations from the measurements help guide us to several promising future directions to efficiently enable ubiquitous learning.","PeriodicalId":162801,"journal":{"name":"Proceedings of the 5th International Workshop on Embedded and Mobile Deep Learning","volume":"141 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123226989","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
Benchmarking Video Object Detection Systems on Embedded Devices under Resource Contention
Jayoung Lee, Pengcheng Wang, Ran Xu, Venkateswara Dasari, Noah Weston, Yin Li, S. Bagchi, S. Chaterji
{"title":"Benchmarking Video Object Detection Systems on Embedded Devices under Resource Contention","authors":"Jayoung Lee, Pengcheng Wang, Ran Xu, Venkateswara Dasari, Noah Weston, Yin Li, S. Bagchi, S. Chaterji","doi":"10.1145/3469116.3470010","DOIUrl":"https://doi.org/10.1145/3469116.3470010","url":null,"abstract":"Adaptive and efficient computer vision systems have been proposed to make computer vision tasks, e.g., object classification and object detection, optimized for embedded boards or mobile devices. These studies focus on optimizing the model (deep network) or system itself, by designing an efficient network architecture or adapting the network architecture at runtime using approximation knobs, such as image size, type of object tracker, head of the object detector (e.g., lighter-weight heads such as one-shot object detectors like YOLO over two-shot object detectors like FRCNN). In this work, we benchmark different video object detection protocols, including FastAdapt, with respect to accuracy, latency, and energy consumption on three different embedded boards that represent the leading edge mobile GPUs. Our set of protocols consists of Faster R-CNN, YOLOv3, SELSA, MEGA, and REPP. Further, we characterize their performance under different levels of resource contention, specifically GPU contention, as would arise due to co-located applications on these boards, contending with the video object detection task. Our key insights are that object detectors have to be coupled with trackers to keep up with the latency requirements (e.g., 30 fps). With this, FastAdapt achieves up to 76 fps on the most well-resourced NVIDIA Jetson-class board---the NVIDIA AGX Xavier. Second, adaptive protocols like FastAdapt, FRCNN, and YOLO (specifically our adaptive variants, FRCNN+ and YOLO+) work well under resource constraints. Among the latest video object detection heads, SELSA achieves the highest accuracy but at a latency of over 2 sec per frame. Our energy consumption experiments bring out that FastAdapt, adaptive FRCNN, and adaptive YOLO are best-in-class, relative to the non-adaptive protocols SELSA, MEGA, and REPP.","PeriodicalId":162801,"journal":{"name":"Proceedings of the 5th International Workshop on Embedded and Mobile Deep Learning","volume":"235 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134072321","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 11
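The benchmarking abstract above quotes both frame rates (30 fps, 76 fps) and a per-frame latency (over 2 sec). As a reading aid, the short sketch below converts between the two units using only those quoted figures. The function names are illustrative and do not come from the paper, and the conversion assumes frames are processed one at a time with no pipelining.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available per frame, in milliseconds, to sustain `fps`."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def fps_from_latency_ms(latency_ms: float) -> float:
    """Frame rate reached when every frame takes `latency_ms`
    (assumes strictly sequential processing, no pipelining)."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


# Figures quoted in the abstract
realtime_budget = frame_budget_ms(30)      # budget per frame at 30 fps
fastadapt_budget = frame_budget_ms(76)     # per-frame time at 76 fps
selsa_fps = fps_from_latency_ms(2000)      # 2 s per frame, the quoted lower bound

print(f"30 fps budget: {realtime_budget:.1f} ms per frame")
print(f"76 fps:        {fastadapt_budget:.1f} ms per frame")
print(f"2 s per frame: {selsa_fps:.2f} fps")
```

Under these assumptions, a 30 fps target leaves roughly 33 ms per frame, so a detector needing more than 2 seconds per frame runs at no more than 0.5 fps.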
Enabling Binary Neural Network Training on the Edge
Erwei Wang, James J. Davis, Daniele Moro, Piotr Zielinski, Jia Jie Lim, Claudionor José Nunes Coelho, Satrajit Chatterjee, Peter Y. K. Cheung, G. Constantinides
{"title":"Enabling Binary Neural Network Training on the Edge","authors":"Erwei Wang, James J. Davis, Daniele Moro, Piotr Zielinski, Jia Jie Lim, Claudionor José Nunes Coelho, Satrajit Chatterjee, Peter Y. K. Cheung, G. Constantinides","doi":"10.1145/3469116.3470015","DOIUrl":"https://doi.org/10.1145/3469116.3470015","url":null,"abstract":",","PeriodicalId":162801,"journal":{"name":"Proceedings of the 5th International Workshop on Embedded and Mobile Deep Learning","volume":"136 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116377504","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 11
ParallelFusion
Jingyu Lee, Yunxin Liu, Youngki Lee
{"title":"ParallelFusion","authors":"Jingyu Lee, Yunxin Liu, Youngki Lee","doi":"10.1145/3469116.3470014","DOIUrl":"https://doi.org/10.1145/3469116.3470014","url":null,"abstract":"Mobile GPUs are extremely under-utilized for DNN computations across different mobile deep learning frameworks and multiple DNNs with various complexities. We explore the feasibility of batching and it improves the throughput by up to 35%. However, real-time applications in mobile have a limited amount of requests to get a benefit from batching. To tackle the challenge, we present ParallelFusion technique that enables concurrent execution of heterogeneous operators to further utilize the mobile GPU. We implemented ParallelFusion over the MNN framework and evaluated on 6 state-of-the-art DNNs. Our evaluation shows that Parallel Fusion achieves up to 195% to 218% throughput with fused execution of 2 and 3 operators compared to single DNN inference.","PeriodicalId":162801,"journal":{"name":"Proceedings of the 5th International Workshop on Embedded and Mobile Deep Learning","volume":"43 7","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120899421","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Adaptive Inference through Early-Exit Networks: Design, Challenges and Directions
Stefanos Laskaridis, Alexandros Kouris, N. Lane
{"title":"Adaptive Inference through Early-Exit Networks: Design, Challenges and Directions","authors":"Stefanos Laskaridis, Alexandros Kouris, N. Lane","doi":"10.1145/3469116.3470012","DOIUrl":"https://doi.org/10.1145/3469116.3470012","url":null,"abstract":"DNNs are becoming less and less over-parametrised due to recent advances in efficient model design, through careful hand-crafted or NAS-based methods. Relying on the fact that not all inputs require the same amount of computation to yield a confident prediction, adaptive inference is gaining attention as a prominent approach for pushing the limits of efficient deployment. Particularly, early-exit networks comprise an emerging direction for tailoring the computation depth of each input sample at runtime, offering complementary performance gains to other efficiency optimisations. In this paper, we decompose the design methodology of early-exit networks to its key components and survey the recent advances in each one of them. We also position early-exiting against other efficient inference solutions and provide our insights on the current challenges and most promising future directions for research in the field.","PeriodicalId":162801,"journal":{"name":"Proceedings of the 5th International Workshop on Embedded and Mobile Deep Learning","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-06-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125630142","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 56
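The early-exit abstract above describes tailoring the computation depth of each input at runtime. The sketch below illustrates that idea with one common exit rule, a confidence threshold on the softmax output. That rule is an assumption made here for illustration, not necessarily the policy the paper surveys or recommends. The names `early_exit_predict` and `toy_stages`, the threshold of 0.9, and the toy blocks are all hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
# One stage = (backbone block, exit head producing class logits)
Stage = Tuple[Callable[[Vector], Vector], Callable[[Vector], Vector]]


def softmax(logits: Sequence[float]) -> Vector:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_predict(
    x: Vector, stages: Sequence[Stage], threshold: float = 0.9
) -> Tuple[int, int, float]:
    """Run stages in order; stop at the first exit whose top-class
    probability reaches `threshold`. Returns (label, exit depth, confidence).
    The last stage always produces an answer."""
    if not stages:
        raise ValueError("at least one stage is required")
    depth, probs = 0, []
    for depth, (block, head) in enumerate(stages, start=1):
        x = block(x)                  # deeper features
        probs = softmax(head(x))      # prediction at this exit
        if max(probs) >= threshold:   # confident enough: skip the rest
            break
    confidence = max(probs)
    return probs.index(confidence), depth, confidence


# Toy model (hypothetical): each block doubles the features,
# and each exit head uses the features directly as logits.
toy_stages: List[Stage] = [
    (lambda v: [2.0 * a for a in v], lambda v: list(v)) for _ in range(3)
]

for sample in ([3.0, 0.0], [1.0, 0.0], [0.0, 0.0]):
    label, depth, conf = early_exit_predict(sample, toy_stages)
    print(f"input={sample} label={label} exit_depth={depth} conf={conf:.3f}")
```

In this toy setup, an input that is easy to separate leaves at the first exit, a harder one at the second, and a fully ambiguous one runs through all three stages.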