Recognition of American sign language using LBG vector quantization

2014 International Conference on Computer Communication and Informatics. Pub Date: 2014-10-16. DOI: 10.1109/ICCCI.2014.6921745

Krupali Suresh Raut, S. Mali, Sudeep D. Thepade, S. Sanas


Citations: 11

Abstract

Sign language is a widely used and accepted standard for communication by people with hearing and speech impairments. A sign language conveys meaning through visually transmitted sign patterns rather than acoustically conveyed sound patterns. Several approaches to generalizing American Sign Language recognition have been proposed in the past [14] [15]. This paper proposes a novel method of American Sign Language recognition that uses shape and texture features. Many existing systems require the signer to use special data acquisition devices, such as data gloves, which are expensive and difficult to handle. We propose a more flexible vision-based approach in which the signer is free from additional equipment. Because shape and texture features are more distinctive, a novel technique based on their combination is derived and proposed here. Shape features are extracted with standard gradient operators (Roberts, Prewitt, Sobel, Canny, Frei-Chen, Kirsch and Laplacian), and texture features are extracted with vector quantization. The gradient mask images of the character images are obtained, and the LBG vector quantization algorithm is then applied to these gradient images to generate codebooks of various sizes. The resulting LBG codebooks serve as shape-texture feature vectors for American Sign Language recognition. The database contains images of the 26 American Sign Language alphabet signs, each performed by 12 different people. The images are saved in JPEG format and stored in a separate folder. In total, 312 images and 8 codebook sizes (from 4 to 512) were used. The nearest neighbour (KNN) algorithm is used as the performance comparison criterion for the proposed character recognition techniques. The best performance is observed for LBG with codebook size 8 on the Canny operator, and the next best for codebook size 4 on the Frei-Chen gradient mask.
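The feature pipeline described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the choice of the Sobel operator (one of the seven listed), the `|Gx| + |Gy|` magnitude approximation, the 2x2 block size, the squared Euclidean distance, and the function names `sobel_magnitude`, `blocks_2x2` and `classify_1nn` are all assumptions made for illustration. The LBG codebook step that sits between block extraction and classification is left out of this sketch.

```python
def sobel_magnitude(img):
    """Approximate gradient magnitude |Gx| + |Gy| of a 2-D grayscale image.

    img is a list of equal-length rows; border pixels are left at 0.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            # Horizontal Sobel mask: right column minus left column.
            gx = (img[r - 1][c + 1] + 2 * img[r][c + 1] + img[r + 1][c + 1]
                  - img[r - 1][c - 1] - 2 * img[r][c - 1] - img[r + 1][c - 1])
            # Vertical Sobel mask: bottom row minus top row.
            gy = (img[r + 1][c - 1] + 2 * img[r + 1][c] + img[r + 1][c + 1]
                  - img[r - 1][c - 1] - 2 * img[r - 1][c] - img[r - 1][c + 1])
            out[r][c] = abs(gx) + abs(gy)
    return out


def blocks_2x2(img):
    """Flatten non-overlapping 2x2 blocks into 4-dimensional vectors.

    The block size is an assumption; the abstract does not state it.
    """
    h, w = len(img), len(img[0])
    return [[img[r][c], img[r][c + 1], img[r + 1][c], img[r + 1][c + 1]]
            for r in range(0, h - 1, 2)
            for c in range(0, w - 1, 2)]


def classify_1nn(query, labelled):
    """Return the label of the stored feature vector nearest to query.

    labelled is a list of (feature_vector, label) pairs; distance is
    squared Euclidean (an assumed metric).
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(labelled, key=lambda item: sq_dist(query, item[0]))[1]
```

In a full system, the vectors returned by `blocks_2x2` would be the training set for codebook generation, and the flattened codebook of each image would be the feature vector passed to `classify_1nn`.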
The LBG VQ [11] design algorithm is an iterative algorithm that requires an initial codebook C. This initial codebook is obtained by the splitting method: an initial code vector is set to the average of the entire training sequence and is then split into two. The iterative algorithm is run with these two vectors as the initial codebook. The final two code vectors are split into four, and the process is repeated until the desired number of code vectors is obtained.
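The splitting procedure can be sketched in plain Python as follows. This is a hedged illustration, not the paper's code: the additive perturbation `eps`, the relative stopping tolerance `tol`, the iteration cap, the handling of empty cells, and all function names are assumptions. Because each round doubles the codebook, the sketch only accepts sizes that are powers of two, which matches the 4 to 512 range mentioned in the abstract.

```python
def mean_vector(vectors):
    """Component-wise average of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def refine(training, codebook, tol=1e-6, max_iter=100):
    """Iterate assignment and centroid update until distortion stops improving."""
    prev = float("inf")
    for _ in range(max_iter):
        clusters = [[] for _ in codebook]
        distortion = 0.0
        for v in training:
            # Assign each training vector to its nearest code vector.
            i = min(range(len(codebook)), key=lambda k: sq_dist(v, codebook[k]))
            clusters[i].append(v)
            distortion += sq_dist(v, codebook[i])
        distortion /= len(training)
        # Move each code vector to its cell centroid; an empty cell keeps
        # its old code vector (an assumed policy).
        codebook = [mean_vector(c) if c else cv
                    for c, cv in zip(clusters, codebook)]
        if prev - distortion <= tol * max(distortion, 1e-12):
            break
        prev = distortion
    return codebook


def lbg(training, size, eps=0.01):
    """Build a codebook of `size` code vectors by repeated splitting."""
    if size < 1 or size & (size - 1):
        raise ValueError("size must be a power of two")
    # Start from the average of the entire training sequence.
    codebook = [mean_vector(training)]
    while len(codebook) < size:
        # Split every code vector into two slightly perturbed copies.
        codebook = [[x + s for x in cv] for cv in codebook for s in (eps, -eps)]
        codebook = refine(training, codebook)
    return codebook
```

For example, `lbg(training, 8)` would return eight code vectors for a list `training` of equal-length numeric vectors. The additive split is one common choice; a multiplicative perturbation such as `x * (1 + eps)` is another, but it cannot separate an all-zero code vector.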
