A Generic Approach towards Amharic Sign Language Recognition

Netsanet Yigzaw, M. Meshesha, Chala Diriba
Adv. Hum. Comput. Interact., published 2022-09-22
DOI: 10.1155/2022/1112169 (https://doi.org/10.1155/2022/1112169)
Citations: 2

Abstract

Good communication channels are crucial for mutual understanding in the day-to-day life of communities. The hearing-impaired community uses sign language, a visual and gestural language that differs from written and spoken languages in orientation and expression. Although sign language is an excellent medium for communication among hearing-impaired people, a communication barrier remains between hearing-impaired people and people without hearing impairment. To address this issue, researchers have proposed sign-language-to-text translation systems for English and other European languages. The goal of this research is to design and develop a system that converts Ethiopian Sign Language into Amharic digital text. The proposed system combines two deep learning components: a pretrained deep learning model and a Long Short-Term Memory (LSTM) network. The pretrained model extracts features from single-frame images, while the LSTM extracts sequence information from the sequence of image frames that make up a specific sign. The dataset used to train the models was collected in video format from Addis Ababa University. Before the dataset was fed to the deep learning models, it was preprocessed through cleaning and video-to-image-frame segmentation. The system was trained, validated, and tested using 80%, 10%, and 10%, respectively, of the 2475 images created during preprocessing. Two pretrained deep learning models, EfficientNetB0 and ResNet50, were used in this investigation, and they attained an accuracy of 72.79%. In terms of precision and F1-score, ResNet50 outperformed EfficientNetB0. A graphical user interface prototype was created for the proposed system, and the best-performing model was selected and implemented in it. Based on the outcomes of the experiment, the proposed system can serve as a starting point for other researchers to improve upon. It could be improved with more high-quality training data and high-performance training machines, such as GPU-enabled computers.
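The 80%/10%/10% partition of the 2475 frames mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say whether the split was random, stratified by sign class, or done per video, so the seeded random shuffle below is an assumption, and `split_indices` is a hypothetical helper name rather than anything from the paper.

```python
import random


def split_indices(n_items, train_pct=80, val_pct=10, seed=0):
    """Shuffle indices 0..n_items-1 and cut them into train/val/test lists.

    Assumption: a plain seeded random shuffle; the paper's actual
    splitting procedure is not described in the abstract.
    """
    if train_pct < 0 or val_pct < 0 or train_pct + val_pct > 100:
        raise ValueError("percentages must be non-negative and sum to at most 100")

    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)

    # Integer arithmetic avoids floating-point rounding surprises.
    n_train = n_items * train_pct // 100
    n_val = n_items * val_pct // 100

    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


train, val, test = split_indices(2475)
print(len(train), len(val), len(test))  # 1980 247 248
```

Because 10% of 2475 is not a whole number, the sketch rounds the validation set down and gives the remaining frame to the test set. Frames cut from the same video are highly correlated, so a split at the video level may give a more conservative estimate of accuracy; whether the authors did this is not stated.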