The Eye: A Light Weight Mobile Application for Visually Challenged People Using Improved YOLOv5l Algorithm

Q1 Multidisciplinary
Kalaiarasi Sonai Muthu Anbananthen, Sridevi Subbiah, Subiksha Gayathri Baskar, Ratchana Selvaraj, Jayakumar Krishnan, Subarmaniam Kannan, Deisy Chelliah
Published 2023-10-01 in Emerging Science Journal. DOI: 10.28991/esj-2023-07-05-011 (https://doi.org/10.28991/esj-2023-07-05-011)
Citations: 0

Abstract

The eye is an essential sensory organ that lets us perceive our surroundings at a glance, and losing this sense creates numerous challenges in daily life. Because society is designed for the majority, visually impaired individuals face even more difficulties, so empowering them and promoting self-reliance are crucial. To address this need, we propose a new Android application called “The Eye” that uses Machine Learning (ML)-based object detection to recognize objects in real time through a smartphone camera or a camera attached to a stick. This article proposes an improved YOLOv5l algorithm for object detection in visual applications. YOLOv5l has a larger model size than the smaller variants YOLOv5s and YOLOv5m and captures more complex features and details, which leads to higher object detection accuracy. The primary enhancement in the improved YOLOv5l algorithm is the integration of L1 and L2 regularization. These techniques add a regularization term to the loss function during training, which helps prevent overfitting and improves generalization. Our approach combines image processing and text-to-speech conversion modules to produce reliable results; the Android text-to-speech module then converts the object recognition results into audio output. According to the experimental results, the improved YOLOv5l achieves higher detection accuracy than the original YOLOv5 and detects small, multiple, and overlapping targets more accurately. This study contributes to the advancement of technology that helps visually impaired individuals become more self-sufficient and confident.
DOI: 10.28991/ESJ-2023-07-05-011
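The abstract says that L1 and L2 regularization add a term to the loss function during training, but it does not give the exact formulation or the coefficients used. As a rough illustration only, a combined penalty of the common form loss + l1 * sum(|w|) + l2 * sum(w^2) could be sketched as below. The function names, the default coefficients, and the flat list of weights are assumptions made for this example; they are not taken from the paper or from the YOLOv5 code base.

```python
from typing import Sequence


def elastic_penalty(weights: Sequence[float], l1: float, l2: float) -> float:
    """Return l1 * sum(|w|) + l2 * sum(w^2) for a flat sequence of weights.

    Hypothetical helper: the paper does not state how its penalty is computed.
    """
    l1_term = sum(abs(w) for w in weights)  # L1 part: pushes weights toward zero
    l2_term = sum(w * w for w in weights)   # L2 part: discourages large weights
    return l1 * l1_term + l2 * l2_term


def regularized_loss(
    data_loss: float,
    weights: Sequence[float],
    l1: float = 1e-5,   # assumed coefficient, not from the paper
    l2: float = 5e-4,   # assumed coefficient, not from the paper
) -> float:
    """Add the combined L1/L2 penalty to an already computed detection loss."""
    return data_loss + elastic_penalty(weights, l1, l2)


if __name__ == "__main__":
    example_weights = [1.0, -2.0, 0.5]
    print(regularized_loss(1.0, example_weights, l1=0.1, l2=0.01))
```

In a real training loop the weights would be the model's parameter tensors rather than a Python list, and the L2 part is often applied through the optimizer's weight decay setting instead of being added to the loss by hand.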
Source journal: Emerging Science Journal (Multidisciplinary)
CiteScore: 5.40
Self-citation rate: 0.00%
Articles published: 155
Review time: 10 weeks