Advancing human-computer interaction: AI-driven translation of American Sign Language to Nepali using convolutional neural networks and text-to-speech conversion application

Biplov Paneru, Bishwash Paneru, Khem Narayan Poudyal
Systems and Soft Computing, Volume 6, Article 200165. Published 2024-10-29.
DOI: 10.1016/j.sasc.2024.200165
https://www.sciencedirect.com/science/article/pii/S2772941924000942

Abstract

Nepal severely lacks advanced technology that serves people with impairments, particularly tools that help hearing-impaired people communicate. Although sign language is one of the oldest and most natural ways to communicate, few resources are available in Nepal to bridge the communication gap between Nepali and American Sign Language (ASL). This study investigates Convolutional Neural Networks (CNNs) and AI-driven methods for translating ASL into Nepali text and speech. Two pre-trained models, ResNet50 and VGG16, were fine-tuned via transfer learning to classify ASL signs using extensive ASL image datasets. The system uses the Python gTTS package to render recognized signs as Nepali text and speech, and it is integrated with a Tkinter-based Graphical User Interface (GUI) that takes video input through OpenCV. Both CNN architectures achieved an accuracy of over 99%, which allowed smooth conversion of ASL to speech output. By providing a workable way to improve inclusion and communication, the deployment of an AI-driven translation system represents a significant step toward lowering the technological barriers that disabled people in Nepal face.
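
The step between classification and speech output can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the class labels, the Nepali renderings, the confidence threshold, and the function names are all hypothetical, since the abstract does not list the sign classes or describe how predictions are mapped to Nepali. Only the gTTS call with the Nepali language code `ne` reflects the package named in the abstract.

```python
from typing import Optional, Sequence

# Hypothetical label set and Nepali renderings (assumptions for illustration;
# the abstract does not specify the classes the models were trained on).
CLASS_LABELS = ["hello", "thank_you", "water"]
NEPALI_TEXT = {"hello": "नमस्ते", "thank_you": "धन्यवाद", "water": "पानी"}


def predicted_label(probs: Sequence[float], threshold: float = 0.8) -> Optional[str]:
    """Return the most probable label, or None when confidence is below threshold."""
    if len(probs) != len(CLASS_LABELS):
        raise ValueError("expected one probability per class")
    best = max(range(len(probs)), key=lambda i: probs[i])
    return CLASS_LABELS[best] if probs[best] >= threshold else None


def to_nepali(label: Optional[str]) -> Optional[str]:
    """Look up the Nepali text for a label; None passes through unchanged."""
    return NEPALI_TEXT.get(label) if label is not None else None


def speak_nepali(text: str, out_path: str = "sign.mp3") -> str:
    """Synthesize Nepali speech with gTTS and save it as an MP3 file.

    Requires the third-party gTTS package and network access, so the
    import is done lazily and the rest of the module runs without it.
    """
    from gtts import gTTS

    gTTS(text=text, lang="ne").save(out_path)
    return out_path
```

In a full pipeline, `probs` would come from the softmax output of the fine-tuned ResNet50 or VGG16 model for a single video frame, and `speak_nepali(to_nepali(label))` would be called only when `predicted_label` returns a label. The threshold keeps low-confidence frames from triggering speech.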