AI Based Automated Image Caption Tool Implementation for Visually Impaired
Vanshika Wadhwa, Bhoomi Gupta, Sachin Gupta
2021 International Conference on Industrial Electronics Research and Applications (ICIERA), 2021-12-22
DOI: 10.1109/ICIERA53202.2021.9726759
Citations: 1
Abstract
Image captioning is a rapidly emerging area of artificial intelligence concerned with generating natural language descriptions of images. It works at the confluence of image data obtained from datasets and the sentences that describe them, aiming to capture meaningful interpretations of the relationship between the two. It applies convolutional neural networks (CNNs) to images together with LSTM (Long Short-Term Memory) recurrent neural networks (RNNs) over sentences, so that the computer can interpret the context of an image and describe it in a natural language such as English. This paper combines computer vision and natural language processing to build assistive technology that supplements visual data such as images with braille-readable captions, giving visually impaired users a better sense of what is happening around them and helping them understand their surroundings.
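The abstract does not say how captions are rendered as braille. As a minimal sketch of that final step only, the following assumes uncontracted (Grade 1) English braille and Unicode braille patterns as the output; the names `BRAILLE_DOTS`, `dots_to_cell`, and `caption_to_braille` are hypothetical and not taken from the paper. Digits and punctuation are skipped for brevity.

```python
# Sketch only: maps an English caption to Unicode braille cells.
# Assumes uncontracted (Grade 1) braille; not the authors' implementation.

# Raised dot numbers (1-6) for each letter.
BRAILLE_DOTS = {
    "a": "1", "b": "12", "c": "14", "d": "145", "e": "15",
    "f": "124", "g": "1245", "h": "125", "i": "24", "j": "245",
    "k": "13", "l": "123", "m": "134", "n": "1345", "o": "135",
    "p": "1234", "q": "12345", "r": "1235", "s": "234", "t": "2345",
    "u": "136", "v": "1236", "w": "2456", "x": "1346", "y": "13456",
    "z": "1356",
}

BRAILLE_BASE = 0x2800  # Unicode "braille pattern blank"


def dots_to_cell(dots: str) -> str:
    """Convert a string of dot numbers to one Unicode braille character.

    In the Unicode braille block, dot n sets bit n-1 above U+2800.
    """
    offset = sum(1 << (int(d) - 1) for d in dots)
    return chr(BRAILLE_BASE + offset)


def caption_to_braille(caption: str) -> str:
    """Translate letters and whitespace; other characters are dropped."""
    cells = []
    for ch in caption.lower():
        if ch in BRAILLE_DOTS:
            cells.append(dots_to_cell(BRAILLE_DOTS[ch]))
        elif ch.isspace():
            cells.append(chr(BRAILLE_BASE))  # blank cell between words
    return "".join(cells)


if __name__ == "__main__":
    print(caption_to_braille("a dog runs on the grass"))
```

In a full pipeline, the string passed to `caption_to_braille` would be the caption produced by the CNN and LSTM model; a production system would more likely use an established braille translation library that also handles contractions, numbers, and punctuation.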