Computer Vision-Based Bengali Sign Language To Text Generation

2022 IEEE 5th International Conference on Image Processing Applications and Systems (IPAS) · Pub Date: 2022-12-05 · DOI: 10.1109/IPAS55744.2022.10052928

Tonjih Tazalli, Zarin Anan Aunshu, Sumaya Sadbeen Liya, Magfirah Hossain, Zareen Mehjabeen, M. Ahmed, Muhammad Iqbal Hossain


Citations: 2

Abstract


Computer Vision-Based Bengali Sign Language To Text Generation

Around 7% of people worldwide have hearing and speech impairments, and they use sign language to communicate. In our country, many people are born with hearing and speech impairments. Our primary focus is therefore to serve these people by converting Bangla sign language into text. Various projects on Bangla sign language already exist, but they focus mainly on individual letters and numerals. We concentrate instead on Bangla word signs, since communication relies on words and phrases rather than individual letters. Because no proper database exists for Bangla word signs, we want to build one for our work using BDSL. Sign language recognition (SLR) usually falls into two scenarios: isolated SLR, which recognizes one word at a time, and continuous SLR, which translates a whole sentence at once. We work on isolated SLR. We introduce a method that uses PyTorch and YOLOv5 in a video classification model to convert Bangla sign language into text, where each video contains exactly one signed word. We achieved an accuracy of 76.29% on the training dataset and 51.44% on the testing dataset. We are working to build a system that makes it easier for people with hearing and speech impairments to interact with the general public.
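The abstract does not say how per-frame model outputs are combined into a single word per video, so the following is only a minimal sketch under stated assumptions. It assumes a detector emits one `(label, confidence)` pair per frame and that a confidence-weighted vote selects the word for the whole clip, which fits the isolated-SLR setting of one signed word per video. The function names, the `min_conf` threshold, and the English placeholder labels are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

# One detection per frame: (predicted word label, confidence in [0, 1]).
# In practice these would come from a per-frame detector; the values used
# below are hand-written placeholders, not results from the paper.
FrameDetection = Tuple[str, float]


def classify_video(detections: List[FrameDetection],
                   min_conf: float = 0.25) -> Optional[str]:
    """Return one word for the whole video by confidence-weighted voting.

    Assumption: each video shows exactly one signed word (isolated SLR),
    so every frame votes for a single label. Returns None when no
    detection reaches min_conf (a hypothetical threshold).
    """
    scores: Dict[str, float] = defaultdict(float)
    for label, conf in detections:
        if conf >= min_conf:
            scores[label] += conf
    if not scores:
        return None
    return max(scores, key=scores.get)


def accuracy(predicted: List[Optional[str]], expected: List[str]) -> float:
    """Fraction of videos whose predicted word matches the ground truth."""
    if not expected:
        raise ValueError("expected must not be empty")
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct / len(expected)


if __name__ == "__main__":
    videos = [
        [("hello", 0.9), ("hello", 0.8), ("thanks", 0.4)],
        [("thanks", 0.3), ("water", 0.7), ("water", 0.6)],
        [("hello", 0.1)],  # below threshold, so no prediction is made
    ]
    truth = ["hello", "water", "thanks"]
    preds = [classify_video(v) for v in videos]
    print(preds)
    print(accuracy(preds, truth))
```

Computing the same `accuracy` separately on training and testing videos is how a gap like the reported 76.29% versus 51.44% would show up. A gap of that size usually suggests the model generalizes poorly to unseen signers or recordings.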

