Characterizing and Accelerating End-to-End EdgeAI Inference Systems for Object Detection Applications

Yujie Hui, J. Lien, Xiaoyi Lu
{"title":"表征和加速端到端边缘人工智能推理系统的目标检测应用","authors":"Yujie Hui, J. Lien, Xiaoyi Lu","doi":"10.1145/3453142.3491294","DOIUrl":null,"url":null,"abstract":"Modern EdgeAI inference systems still have many cruciallimi-tations. In this paper, we holistically consider implications and optimizations of EdgeAI inference systems for object detection applications in efficiency and accuracy. We summarize three in-trinsic limitations of current-generation EdgeAI inference systems based on our observations (i.e., less compute capabilities, restrictions of operations, and accuracy loss due to numerical precision). Then we propose three approaches to improve end-to-end performance and prediction accuracy: 1) Utilizing parallel computing designs and methods to solve computational bottlenecks; 2) Ap-plying domain-specific optimizations to mostly eliminate accuracy loss; 3) Using higher-quality input data to saturate the processors and accelerators. We also provide five recommendations for end-to-end EdgeAI solution deployments, which are usually neglected by EdgeAI users. In particular, we deploy and optimize two real object detection applications (2D and 3D) on two EdgeAI inference systems (NovuTensor and Nvidia Xavier) with widely used datasets (i.e., MS-COCO, PASCAL-VOC, and KITTI). The results show that runtime performance can be accelerated by up to 2X on NovuTen-sor and the mean average precision (mAP) can be increased by 46% through applying our proposed methods.","PeriodicalId":6779,"journal":{"name":"2021 IEEE/ACM Symposium on Edge Computing (SEC)","volume":"205 1","pages":"01-12"},"PeriodicalIF":0.0000,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"Characterizing and Accelerating End-to-End EdgeAI Inference Systems for Object Detection Applications\",\"authors\":\"Yujie Hui, J. Lien, Xiaoyi Lu\",\"doi\":\"10.1145/3453142.3491294\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Modern EdgeAI inference systems still have many cruciallimi-tations. In this paper, we holistically consider implications and optimizations of EdgeAI inference systems for object detection applications in efficiency and accuracy. We summarize three in-trinsic limitations of current-generation EdgeAI inference systems based on our observations (i.e., less compute capabilities, restrictions of operations, and accuracy loss due to numerical precision). Then we propose three approaches to improve end-to-end performance and prediction accuracy: 1) Utilizing parallel computing designs and methods to solve computational bottlenecks; 2) Ap-plying domain-specific optimizations to mostly eliminate accuracy loss; 3) Using higher-quality input data to saturate the processors and accelerators. We also provide five recommendations for end-to-end EdgeAI solution deployments, which are usually neglected by EdgeAI users. In particular, we deploy and optimize two real object detection applications (2D and 3D) on two EdgeAI inference systems (NovuTensor and Nvidia Xavier) with widely used datasets (i.e., MS-COCO, PASCAL-VOC, and KITTI). 
The results show that runtime performance can be accelerated by up to 2X on NovuTen-sor and the mean average precision (mAP) can be increased by 46% through applying our proposed methods.\",\"PeriodicalId\":6779,\"journal\":{\"name\":\"2021 IEEE/ACM Symposium on Edge Computing (SEC)\",\"volume\":\"205 1\",\"pages\":\"01-12\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2021 IEEE/ACM Symposium on Edge Computing (SEC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3453142.3491294\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE/ACM Symposium on Edge Computing (SEC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3453142.3491294","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 3

Abstract

Modern EdgeAI inference systems still have many crucial limitations. In this paper, we holistically consider implications and optimizations of EdgeAI inference systems for object detection applications in efficiency and accuracy. We summarize three intrinsic limitations of current-generation EdgeAI inference systems based on our observations (i.e., limited compute capability, restrictions on operations, and accuracy loss due to reduced numerical precision). Then we propose three approaches to improve end-to-end performance and prediction accuracy: 1) utilizing parallel computing designs and methods to resolve computational bottlenecks; 2) applying domain-specific optimizations to mostly eliminate accuracy loss; 3) using higher-quality input data to saturate the processors and accelerators. We also provide five recommendations for end-to-end EdgeAI solution deployments, which are usually neglected by EdgeAI users. In particular, we deploy and optimize two real object detection applications (2D and 3D) on two EdgeAI inference systems (NovuTensor and Nvidia Xavier) with widely used datasets (i.e., MS-COCO, PASCAL-VOC, and KITTI). The results show that runtime performance can be accelerated by up to 2X on NovuTensor and the mean average precision (mAP) can be increased by 46% by applying our proposed methods.
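The reported accuracy gains are expressed in mean average precision (mAP), the standard object detection metric on MS-COCO, PASCAL-VOC, and KITTI: detections are matched to ground-truth boxes by intersection-over-union (IoU), and precision is averaged over recall. The snippet below is a minimal illustrative sketch of this metric for a single class at a single IoU threshold; it is not the paper's evaluation code, and the box coordinates, scores, and function names are assumptions made purely for illustration.

```python
import numpy as np

def iou(box_a, box_b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def average_precision(detections, ground_truths, iou_thresh=0.5):
    """AP for one class at one IoU threshold.

    detections: list of (confidence_score, box); ground_truths: list of boxes.
    Detections are matched greedily in descending-score order, and each
    ground-truth box may be matched at most once (VOC-style matching).
    """
    detections = sorted(detections, key=lambda d: d[0], reverse=True)
    matched = [False] * len(ground_truths)
    tp, fp = [], []
    for _, box in detections:
        best_iou, best_idx = 0.0, -1
        for i, gt in enumerate(ground_truths):
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, i
        if best_iou >= iou_thresh and not matched[best_idx]:
            matched[best_idx] = True
            tp.append(1)
            fp.append(0)
        else:
            tp.append(0)
            fp.append(1)
    tp, fp = np.cumsum(tp), np.cumsum(fp)
    recall = tp / max(len(ground_truths), 1)
    precision = tp / np.maximum(tp + fp, 1e-9)
    # Simple (non-interpolated) area under the precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precision, recall):
        ap += p * (r - prev_recall)
        prev_recall = r
    return ap

if __name__ == "__main__":
    # Toy data: two ground-truth objects, three scored detections.
    gts = [[10, 10, 50, 50], [60, 60, 100, 100]]
    dets = [(0.9, [12, 12, 48, 48]),      # true positive
            (0.8, [61, 58, 99, 102]),     # true positive
            (0.3, [200, 200, 240, 240])]  # false positive
    print(f"AP@0.5 = {average_precision(dets, gts):.3f}")
```

Full evaluation toolkits (e.g., the official COCO or VOC scripts) additionally interpolate the precision-recall curve and average the result over classes (and, for COCO, over IoU thresholds), which is the "mAP" figure the abstract's 46% improvement refers to.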