{"title":"Developing a Deep Learning-enabled Guide for the Visually Impaired","authors":"Allen Shelton, T. Ogunfunmi","doi":"10.1109/GHTC46280.2020.9342873","DOIUrl":null,"url":null,"abstract":"With visual impairment being a major detriment to quality of life, It is worth exploring options available to them to improve that quality. In this paper we propose using deep learning, real-time object recognition, and text-to-speech capabilities to develop an application to aid the visually impaired. Learning was implemented on the Convolutional Neural Network AlexNet, using two different types of image datasets to recognize both objects and buildings. We integrate a video webcam with our trained model to recognize objects in real-time so the visually impaired will be able to perceive their environment. Finally, using text-to-speech, our application audibly speaks what our trained model recognizes so they will know what's around them. After obtaining initial results from retrained AlexNet, we attempted two modifications of the original architecture to improve its performance for our application for image recognition for the visually impaired, the first change being to the fully connected layers and the second change being to the convolutional layers. Our results show recognition of 92% for internal object data and 88% for external object data. This will go a long way to achieve UN SDG3 goals for good health and well-being for a large percentage of visually impaired people worldwide.","PeriodicalId":314837,"journal":{"name":"2020 IEEE Global Humanitarian Technology Conference (GHTC)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 IEEE Global Humanitarian Technology Conference (GHTC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/GHTC46280.2020.9342873","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 1
Abstract
Visual impairment is a major detriment to quality of life, so it is worth exploring options that can improve that quality for the people affected by it. In this paper we propose using deep learning, real-time object recognition, and text-to-speech to develop an application that aids the visually impaired. Learning was implemented on the AlexNet convolutional neural network, using two different types of image datasets so that the model recognizes both objects and buildings. We integrate a webcam with our trained model to recognize objects in real time, allowing visually impaired users to perceive their environment. Finally, using text-to-speech, our application speaks aloud what the trained model recognizes so users know what is around them. After obtaining initial results from the retrained AlexNet, we attempted two modifications of the original architecture to improve its performance for image recognition for the visually impaired: the first to the fully connected layers and the second to the convolutional layers. Our results show recognition accuracy of 92% on internal object data and 88% on external object data. This will go a long way toward achieving the UN SDG 3 goal of good health and well-being for the large number of visually impaired people worldwide.
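To make the described pipeline concrete, the sketch below shows one common way to retrain AlexNet on a custom image dataset and then announce webcam predictions with text-to-speech. It is not the authors' code: the dataset path, epoch count, optimizer settings, and the choice to swap only the final fully connected layer are illustrative assumptions, and the exact torchvision weights API depends on the installed version.

```python
# Minimal sketch (assumptions noted above), using PyTorch/torchvision,
# OpenCV for the webcam, and pyttsx3 for text-to-speech.
import cv2
import pyttsx3
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),          # AlexNet's expected input size
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Hypothetical dataset layout: one sub-folder per object class.
train_set = datasets.ImageFolder("data/objects", transform=preprocess)
loader = torch.utils.data.DataLoader(train_set, batch_size=32, shuffle=True)

# Load ImageNet-pretrained AlexNet and replace the last fully connected
# layer so it outputs one score per class in the custom dataset.
model = models.alexnet(weights=models.AlexNet_Weights.DEFAULT)
model.classifier[6] = nn.Linear(4096, len(train_set.classes))

optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
criterion = nn.CrossEntropyLoss()
model.train()
for epoch in range(5):                      # illustrative epoch count
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()

# Real-time loop: classify each webcam frame and speak the predicted label.
model.eval()
engine = pyttsx3.init()
cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    tensor = preprocess(transforms.ToPILImage()(rgb)).unsqueeze(0)
    with torch.no_grad():
        pred = model(tensor).argmax(dim=1).item()
    engine.say(train_set.classes[pred])     # announce the recognized object
    engine.runAndWait()
cap.release()
```

Replacing only the final classifier layer is the simplest form of the fully connected modification the abstract mentions; the paper's second modification, to the convolutional layers, would require editing `model.features`, which is not shown here.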