{"title":"KP2Dtiny: Quantized Neural Keypoint Detection and Description on the Edge","authors":"Thomas Rüegg, Marco Giordano, Michele Magno","doi":"10.1109/AICAS57966.2023.10168598","DOIUrl":null,"url":null,"abstract":"Detection and description of keypoints in images is a fundamental component of a wide range of tasks such as Simultaneous Localization And Mapping (SLAM), image alignment and structure from motion (SfM). Efficient computation of these features is crucial for real-time applications and has been addressed by multiple handcrafted algorithms and, recently, by deep neural network-based detectors. Learned detectors achieve high detection performance, but pose high computational requirements, making them slow and impractical for low-power resource constraint platforms. This paper presents a quantized neural keypoint detector and descriptor optimized for edge devices exploiting two recent AI platforms such as MAX78000 by Analog Devices and the Coral AI USB accelerator from Google. To accommodate the diverse constraints and requirements of various applications, we propose and evaluate two model architectures (KP2DtinySmall and KP2DtinyFast) and deploy them on the aforementioned platforms using full 8-bit integer quantization. Furthermore, we extensively evaluate these models in terms of power, latency and accuracy, reporting results on three image sizes (88x88, 320x240 and 640x480), evaluating both quantized and non-quantized models. Fully quantized, KP2DtinySmall reduces network size by a factor of 54x while improving homographic estimation accuracy on 88x88 images on the most stringent threshold (Correctness d1) by 32.4% (0.550) and on 320x240 images by 10.7% (0.648) compared to the KeypointNet architecture by Yang You et. al. This result is achieved by designing a new network with low power platforms in mind, particularly addressing the lower resolution by increasing the density of detectable features. Deployed on the MAX78000 MCU, inference of low-resolution images is run at 59 FPS, consuming 1.1 mJ per image. On the Coral usb accelerator, KP2DtinyFast runs inference on low-resolution images at 527 FPS consuming 3.1 mJ, on high resolution it achieves 70 FPS at 19.9 mJ per inference.","PeriodicalId":296649,"journal":{"name":"2023 IEEE 5th International Conference on Artificial Intelligence Circuits and Systems (AICAS)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-06-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 IEEE 5th International Conference on Artificial Intelligence Circuits and Systems (AICAS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/AICAS57966.2023.10168598","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Detection and description of keypoints in images is a fundamental component of a wide range of tasks such as Simultaneous Localization And Mapping (SLAM), image alignment, and structure from motion (SfM). Efficient computation of these features is crucial for real-time applications and has been addressed by multiple handcrafted algorithms and, more recently, by deep neural network-based detectors. Learned detectors achieve high detection performance but impose high computational requirements, making them slow and impractical for low-power, resource-constrained platforms. This paper presents a quantized neural keypoint detector and descriptor optimized for edge devices, exploiting two recent AI platforms: the MAX78000 by Analog Devices and the Coral AI USB accelerator from Google. To accommodate the diverse constraints and requirements of various applications, we propose and evaluate two model architectures (KP2DtinySmall and KP2DtinyFast) and deploy them on the aforementioned platforms using full 8-bit integer quantization. Furthermore, we extensively evaluate these models in terms of power, latency, and accuracy, reporting results on three image sizes (88×88, 320×240, and 640×480) for both quantized and non-quantized models. Fully quantized, KP2DtinySmall reduces network size by a factor of 54× while improving homography estimation accuracy at the most stringent threshold (Correctness d1) by 32.4% (0.550) on 88×88 images and by 10.7% (0.648) on 320×240 images compared to the KeypointNet architecture by Yang You et al. This result is achieved by designing a new network with low-power platforms in mind, in particular addressing the lower resolution by increasing the density of detectable features. Deployed on the MAX78000 MCU, inference on low-resolution images runs at 59 FPS, consuming 1.1 mJ per image. On the Coral USB accelerator, KP2DtinyFast runs inference on low-resolution images at 527 FPS consuming 3.1 mJ per image; at high resolution it achieves 70 FPS at 19.9 mJ per inference.
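The abstract refers to deploying the models with full 8-bit integer quantization on the Coral AI USB accelerator. As a rough illustration only (not the authors' actual pipeline), the sketch below shows how a SavedModel version of such a network could be converted to a fully integer-quantized TensorFlow Lite model using a representative calibration dataset, which is what the Coral Edge TPU requires; the model path, the 1×88×88×3 input shape, and the random calibration data are placeholders.

```python
# Sketch: full 8-bit integer post-training quantization with TensorFlow Lite,
# as required for Coral Edge TPU deployment. The model path, input shape and
# calibration data below are illustrative placeholders, not from the paper.
import numpy as np
import tensorflow as tf

def representative_dataset():
    # Calibration samples drive the estimation of int8 scales/zero-points.
    # A real calibration run would iterate over actual dataset images.
    for _ in range(100):
        yield [np.random.rand(1, 88, 88, 3).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_saved_model("kp2dtiny_fast_savedmodel")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
# Force all operators, inputs and outputs to 8-bit integers.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8

tflite_model = converter.convert()
with open("kp2dtiny_fast_int8.tflite", "wb") as f:
    f.write(tflite_model)
# The resulting .tflite file would then be compiled for the Edge TPU
# (e.g. with the edgetpu_compiler tool) before running on the Coral accelerator.
```

Deployment on the MAX78000 follows a different toolchain (Analog Devices' ai8x training and synthesis tools), which this sketch does not cover.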