Deaf Talk using 3D animated sign language: a sign language interpreter using Microsoft's Kinect v2

Mateen Ahmed, Mujtaba Idrees, Zain Ul Abideen, R. Mumtaz, S. Khalique
DOI: 10.1109/SAI.2016.7556002
Published in: 2016 SAI Computing Conference (SAI), 2016-07-13
Citations: 28

Abstract

This paper describes a novel approach to bridging the communication gap between deaf people and hearing people. Every community includes people who face severe communication difficulties due to speech and hearing impairments. Such people communicate through gestures and symbols, and this mode of communication is called sign language. The communication problem does not end there, however, because speakers of natural languages do not understand sign language, resulting in a communication gap. There is therefore a need for a system that can act as an interpreter for sign language users and as a translator for natural language speakers. To this end, this research developed a software-based solution that exploits Microsoft's latest technology, Kinect for Windows v2. The proposed system, dubbed Deaf Talk, acts as a sign language interpreter and translator, providing a dual mode of communication between sign language users and natural language speakers. The dual mode of communication comprises two independent modules: (1) sign/gesture-to-speech conversion and (2) speech-to-sign-language conversion. In the sign-to-speech module, the person with a speech impairment stands within the Kinect's field of view (FOV) and performs sign language gestures. The system captures the gestures through the Kinect sensor and recognizes them by comparing them with trained gestures already stored in a database. Once a gesture is recognized, it is mapped to its corresponding keyword. The keywords are then sent to a text-to-speech module, which speaks the sentence for the natural language speaker. In contrast, the speech-to-sign-language module translates spoken language into sign language.
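The gesture-recognition step described above (comparing an observed gesture against trained templates in a database, then mapping the best match to a keyword) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the feature vectors, template values, vocabulary, and distance threshold are all hypothetical, and the real system draws its features from Kinect v2 skeletal data.

```python
import math

# Hypothetical trained templates: keyword -> stored feature vector.
# In the actual system these would be derived from Kinect joint data.
TEMPLATES = {
    "hello":  [0.10, 0.82, 0.33, 0.91],
    "thanks": [0.75, 0.12, 0.58, 0.22],
    "help":   [0.41, 0.44, 0.95, 0.05],
}

def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def match_gesture(observed, threshold=0.5):
    """Return the keyword of the nearest stored template, or None
    if even the best match is farther than the threshold."""
    best, best_dist = None, float("inf")
    for keyword, template in TEMPLATES.items():
        d = euclidean(observed, template)
        if d < best_dist:
            best, best_dist = keyword, d
    return best if best_dist <= threshold else None

print(match_gesture([0.12, 0.80, 0.35, 0.90]))  # near the "hello" template
```

The recognized keyword would then be handed to the text-to-speech module. A real system would likely use a more robust classifier than nearest-neighbor matching, but the map-to-keyword structure is the same.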
In this case, the hearing person stands in the Kinect sensor's FOV and speaks in their native language (English in this case). The system converts the speech to text using a speech-to-text API. The resulting keywords are then mapped to their corresponding pre-stored animated gestures, and the animations for the spoken sentence are played on screen. In this way, the deaf person can visualize the spoken sentence, translated into 3D animated sign language. The accuracy of Deaf Talk is 87 percent for speech-to-sign-language conversion and 84 percent for sign-language-to-speech conversion.