An assistive model for visually impaired people using YOLO and MTCNN

Proceedings of the 3rd International Conference on Cryptography, Security and Privacy. Pub Date: 2019-01-19. DOI: 10.1145/3309074.3309114

F. Rahman, Israt Jahan Ritun, Nafisa Farhin, J. Uddin

Citations: 12

Abstract

Visually impaired people face difficulties in moving safely and independently, which keeps them from regular professional and social activities both indoors and outdoors. They likewise have difficulty identifying the basic elements of their surroundings. This paper presents a model that detects brightness and major colors in real-time images from an external camera using the RGB method, and then identifies basic objects and recognizes faces from a personal dataset. The YOLO algorithm is used for object identification and MTCNN for facial recognition. The software is built on the OpenCV libraries for Python together with a machine learning process. A Raspberry Pi, the main processor of the model, scans and detects facial edges via the Pi camera, while objects in the image are captured and recognized using a mobile camera. Recognition results are conveyed to blind users through a text-to-speech library. A battery makes the device portable. Object detection ran at 6-7 FPS with an accuracy of 63-80%. Face identification achieved 80-100% accuracy.
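The abstract does not spell out how brightness and major colors are derived from RGB values, so the following is only a minimal sketch of one plausible approach, not the authors' implementation. It assumes the image is available as a list of (r, g, b) tuples; the palette, the Rec. 601 luma weights, the thresholds, and all function names are illustrative assumptions. The returned sentence is the kind of text that could be handed to a text-to-speech library.

```python
from collections import Counter

# Assumed reference palette; the paper does not specify one.
PALETTE = {
    "red": (255, 0, 0),
    "green": (0, 255, 0),
    "blue": (0, 0, 255),
    "yellow": (255, 255, 0),
    "white": (255, 255, 255),
    "black": (0, 0, 0),
}


def brightness(rgb):
    """Perceived brightness in [0, 255] using the Rec. 601 luma weights."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def nearest_color(rgb):
    """Name of the palette color closest to rgb (squared RGB distance)."""
    return min(
        PALETTE,
        key=lambda name: sum((a - b) ** 2 for a, b in zip(rgb, PALETTE[name])),
    )


def major_colors(pixels, top=2):
    """Most frequent palette colors, found by letting each pixel vote."""
    votes = Counter(nearest_color(p) for p in pixels)
    return [name for name, _ in votes.most_common(top)]


def describe(pixels, dark_below=85, bright_above=170):
    """Build a short sentence about the scene (thresholds are assumptions)."""
    pixels = list(pixels)
    if not pixels:
        raise ValueError("image has no pixels")
    # Average brightness over all pixels decides the lighting label.
    level = sum(brightness(p) for p in pixels) / len(pixels)
    if level < dark_below:
        label = "dark"
    elif level > bright_above:
        label = "bright"
    else:
        label = "moderately lit"
    colors = ", ".join(major_colors(pixels))
    return f"The scene is {label}; main colors: {colors}."
```

In a real pipeline the pixel list would come from a camera frame (for example a downsampled OpenCV image), and per-pixel voting would normally be vectorized for speed; the pure-Python version here only illustrates the logic.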
