{"title":"Blind Aid: State of the art for Scene Text Detector and Text to Speech","authors":"Srividya Kotagiri, Attada Venkataramana, Gogula Kiran","doi":"10.1109/ICACTA54488.2022.9753094","DOIUrl":null,"url":null,"abstract":"This paper the main focus is on the people who are blind and who cannot see. This prototype leads the blind people to recognize the text before them. The entire paper process of this blind aid. First of all, the blind person will be given with a camera attached to his spectacles. Whenever he wants to read something, he will take a snap of that particular location. Now the text in the image will be detected using an algorithm called EAST (Efficient and Accurate Scene Text Detector) which is an example of FCN with PVANet. In this detection there will be a use of max pooling while feature extraction in images. After detecting the text from image, this project uses Tesseract based OCR Engine to recognize the text in the image. After recognizing the text from the image, the text will be converted to some speech output to the blind person using python package called pytts version 3. The speech converted text will be given as an output to blind person with the aid of speaker. Finally here comes the concept of Modified EAST where the already built in model is extended to increase the accuracy of the prototype or model.","PeriodicalId":345370,"journal":{"name":"2022 International Conference on Advanced Computing Technologies and Applications (ICACTA)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 International Conference on Advanced Computing Technologies and Applications (ICACTA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICACTA54488.2022.9753094","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
This paper focuses on aiding people who are blind and cannot see. The prototype described here enables a blind user to recognize the text in front of them, and the paper covers the entire pipeline of this blind aid. First, the blind person is fitted with a camera attached to their spectacles. Whenever they want to read something, they take a snapshot of that particular location. The text in the captured image is then detected with the EAST algorithm (Efficient and Accurate Scene Text Detector), a fully convolutional network (FCN) built on a PVANet backbone; max pooling is applied during feature extraction from the image. After the text regions are detected, the project uses a Tesseract-based OCR engine to recognize the text within them. The recognized text is then converted to speech using the Python package pyttsx3, and the spoken output is delivered to the blind person through a speaker. Finally, the paper introduces the concept of Modified EAST, in which the pre-trained model is extended to improve the accuracy of the prototype.
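As a rough illustration of the pipeline described above, the sketch below chains the same three stages with off-the-shelf tools: the public pre-trained EAST model loaded through OpenCV's DNN module, Tesseract via pytesseract, and pyttsx3 for speech output. The model path frozen_east_text_detection.pb, the snapshot filename, and the thresholds are assumptions made for illustration; the paper's Modified EAST changes are not reproduced here.

```python
import cv2
import numpy as np
import pytesseract
import pyttsx3

EAST_MODEL = "frozen_east_text_detection.pb"  # assumed path to the public EAST model
INPUT_SIZE = (320, 320)                       # EAST input dimensions must be multiples of 32

def detect_text_boxes(image, conf_threshold=0.5, nms_threshold=0.4):
    """Detect text regions with the EAST detector via OpenCV's DNN module."""
    orig_h, orig_w = image.shape[:2]
    rw, rh = orig_w / INPUT_SIZE[0], orig_h / INPUT_SIZE[1]

    net = cv2.dnn.readNet(EAST_MODEL)
    blob = cv2.dnn.blobFromImage(image, 1.0, INPUT_SIZE,
                                 (123.68, 116.78, 103.94), swapRB=True, crop=False)
    net.setInput(blob)
    scores, geometry = net.forward(["feature_fusion/Conv_7/Sigmoid",
                                    "feature_fusion/concat_3"])

    boxes, confidences = [], []
    rows, cols = scores.shape[2:4]
    for y in range(rows):
        for x in range(cols):
            score = float(scores[0, 0, y, x])
            if score < conf_threshold:
                continue
            # Each output cell covers a 4-pixel stride; geometry holds distances to box edges.
            offset_x, offset_y = x * 4.0, y * 4.0
            angle = geometry[0, 4, y, x]
            cos, sin = np.cos(angle), np.sin(angle)
            box_h = geometry[0, 0, y, x] + geometry[0, 2, y, x]
            box_w = geometry[0, 1, y, x] + geometry[0, 3, y, x]
            end_x = int(offset_x + cos * geometry[0, 1, y, x] + sin * geometry[0, 2, y, x])
            end_y = int(offset_y - sin * geometry[0, 1, y, x] + cos * geometry[0, 2, y, x])
            boxes.append([int(end_x - box_w), int(end_y - box_h),
                          int(box_w), int(box_h)])          # (x, y, w, h) for NMS
            confidences.append(score)

    keep = cv2.dnn.NMSBoxes(boxes, confidences, conf_threshold, nms_threshold)
    results = []
    for i in np.array(keep).flatten():
        x, y, w, h = boxes[i]
        # Rescale boxes from the 320x320 network input back to the original image.
        results.append((int(x * rw), int(y * rh), int((x + w) * rw), int((y + h) * rh)))
    return results

def read_and_speak(image_path):
    """Full pipeline: EAST detection -> Tesseract OCR -> pyttsx3 speech."""
    image = cv2.imread(image_path)
    texts = []
    for (x1, y1, x2, y2) in detect_text_boxes(image):
        roi = image[max(y1, 0):y2, max(x1, 0):x2]
        if roi.size == 0:
            continue
        texts.append(pytesseract.image_to_string(roi).strip())
    sentence = " ".join(t for t in texts if t)

    engine = pyttsx3.init()  # speaks through the default audio device / speaker
    engine.say(sentence if sentence else "No text detected")
    engine.runAndWait()

if __name__ == "__main__":
    read_and_speak("snapshot.jpg")  # hypothetical snapshot from the spectacle-mounted camera
```

In this sketch, detection and recognition are kept as separate stages, mirroring the EAST-then-Tesseract split in the abstract; in a deployed aid the camera capture, cropping margins, and speech rate would all need tuning.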