Natural Scene Text to Voice Signal Conversion for Visually Impaired using Deep Neural Network

R. Kapoor, M. Sushama, Bhavani Reddy Andem, Akhila Sri Phani Sai Sindhura S.
Published in: 2021 Innovations in Power and Advanced Computing Technologies (i-PACT)
Publication date: 2021-11-27
DOI: 10.1109/i-PACT52855.2021.9696523 (https://doi.org/10.1109/i-PACT52855.2021.9696523)
Citations: 0

Abstract

Deep neural network architectures can be used to detect and recognize text in natural or camera images. Converting this text to voice signals would be very helpful for people with partial or no vision. In this work, selective-search-based segmentation together with the VGG deep neural network architecture is used for text detection. The Py-Tesseract optical character recognizer then recognizes the detected text, which is converted to voice signals. The system can be beneficial for recognizing roadside or corridor boards, thus adding some independence to the lives of people with special needs. The system can also be extended with a search mode that looks for a particular text in the nearby location.
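
The flow described in the abstract (region proposals, text/non-text filtering, character recognition, speech output, and an optional search mode) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the names `Detection`, `read_scene_text`, and `announce` are hypothetical, and each stage is passed in as a plain callable so the sketch stays independent of any particular library.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Tuple

# Hypothetical box convention for this sketch: (x, y, width, height).
Box = Tuple[int, int, int, int]


@dataclass
class Detection:
    """One recognized piece of text and the region it came from."""
    box: Box
    text: str


def read_scene_text(
    image: object,
    propose_regions: Callable[[object], Iterable[Box]],
    is_text: Callable[[object, Box], bool],
    recognize: Callable[[object, Box], str],
) -> List[Detection]:
    """Detect text regions, then recognize the text inside each one.

    propose_regions stands in for selective-search segmentation,
    is_text for the VGG-based text/non-text decision, and
    recognize for the OCR step. All three are assumptions of this sketch.
    """
    detections: List[Detection] = []
    for box in propose_regions(image):
        if not is_text(image, box):
            continue  # discard proposals the classifier rejects
        text = recognize(image, box).strip()
        if text:  # skip regions where OCR returned nothing usable
            detections.append(Detection(box, text))
    return detections


def announce(
    detections: Iterable[Detection],
    speak: Callable[[str], None],
    target: Optional[str] = None,
) -> List[str]:
    """Speak every reading, or only those containing `target` (search mode).

    Matching is a case-insensitive substring test, chosen here for
    simplicity; the source does not specify how the search mode matches.
    """
    spoken: List[str] = []
    for det in detections:
        if target is not None and target.lower() not in det.text.lower():
            continue
        speak(det.text)
        spoken.append(det.text)
    return spoken
```

In a real system, `recognize` might crop the region and pass it to `pytesseract.image_to_string`, and `speak` might wrap a text-to-speech engine; which speech engine the authors used is not stated in the abstract, so that choice is left open here.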