{"title":"Real-Time, CNN-Based Assistive Device for Visually Impaired People","authors":"Khaled Jouini, Mohamed Hédi Maâloul, O. Korbaa","doi":"10.1109/CISP-BMEI53629.2021.9624387","DOIUrl":null,"url":null,"abstract":"Visual impairment limits people's ability to move about unaided and interact with the surrounding world. This paper aims to leverage recent advances in deep learning to assist visually impaired people in their daily challenges. The high accuracy of deep learning comes at the expense of high computational requirements for both the training and the inference phases. To meet the computational requirements of deep learning, a common approach is to move data from the assistive device to distant servers (i.e. cloud-based inference). Such data movement requires a fast and active network connection and raises latency, cost, and privacy issues. In contrast with most of exiting assistive devices, in our work we move the computation to where data resides and opt for an approach where inference is performed directly “on” device (i.e. on-device-based inference). Running state-of-the-art deep learning models for a real-time inference on devices with limited resources is a challenging problem that cannot be solved without trading accuracy for speed (no free lunch). In this paper we conduct an extensive experimental study of 12 state-of-the-art object detectors, to strike the best trade-off between speed and accuracy. Our experimental study shows that by choosing the right models, frameworks, and compression techniques, we can achieve decent inference speed with very low accuracy drop.","PeriodicalId":131256,"journal":{"name":"2021 14th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 14th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CISP-BMEI53629.2021.9624387","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2
Abstract
Visual impairment limits people's ability to move about unaided and to interact with the surrounding world. This paper leverages recent advances in deep learning to assist visually impaired people in their daily challenges. The high accuracy of deep learning comes at the expense of high computational requirements for both the training and inference phases. To meet these requirements, a common approach is to move data from the assistive device to distant servers (i.e., cloud-based inference). Such data movement requires a fast, always-on network connection and raises latency, cost, and privacy issues. In contrast with most existing assistive devices, our work moves the computation to where the data resides and performs inference directly "on" the device (i.e., on-device inference). Running state-of-the-art deep learning models for real-time inference on resource-constrained devices is a challenging problem that cannot be solved without trading accuracy for speed (no free lunch). In this paper we conduct an extensive experimental study of 12 state-of-the-art object detectors to strike the best trade-off between speed and accuracy. Our study shows that by choosing the right models, frameworks, and compression techniques, we can achieve decent inference speed with a very low accuracy drop.
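The abstract does not specify the exact compression pipeline used, but post-training quantization is a standard technique for shrinking an object detector to fit on-device inference budgets. The sketch below illustrates the idea with TensorFlow Lite; the SavedModel path, input resolution, and calibration data are hypothetical placeholders, not the paper's actual setup.

```python
# Minimal sketch: post-training integer quantization of an object detector
# with TensorFlow Lite, one common way to trade a small accuracy drop for
# on-device inference speed. Paths and input shape are assumptions.
import numpy as np
import tensorflow as tf

# Load a trained detector (hypothetical SavedModel directory).
converter = tf.lite.TFLiteConverter.from_saved_model("ssd_mobilenet_saved_model")

# Enable default optimizations (8-bit weight quantization).
converter.optimizations = [tf.lite.Optimize.DEFAULT]

def representative_data_gen():
    # Calibration samples let the converter estimate activation ranges
    # for full integer quantization. Real calibration would use images
    # from the target domain; random tensors stand in here (assumed
    # 320x320 RGB input).
    for _ in range(100):
        yield [np.random.rand(1, 320, 320, 3).astype(np.float32)]

converter.representative_dataset = representative_data_gen

# Convert and save the compressed model for deployment on the device.
tflite_model = converter.convert()
with open("detector_int8.tflite", "wb") as f:
    f.write(tflite_model)
```

Quantizing weights and activations to 8-bit integers typically cuts model size by roughly 4x and speeds up inference on mobile CPUs and edge accelerators, at the cost of a small accuracy drop, which is exactly the speed/accuracy trade-off the paper's experimental study quantifies.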