A Data Augmentation and Pre-processing Technique for Sign Language Fingerspelling Recognition

24th Irish Machine Vision and Image Processing Conference. Pub Date: 2022-08-31. DOI: 10.56541/xbav3102

Frank Fowley, Ellen Rushe, Anthony Ventresque


Abstract

The reliance of deep learning algorithms on large-scale datasets is a significant challenge for sign language recognition (SLR). The shortage of data resources for training SLR models inevitably leads to poor generalisation, especially for low-resource languages. We propose novel data augmentation and pre-processing techniques based on synthetic data generation to overcome these generalisation difficulties. Using these methods, our models achieved a top-1 accuracy of 86.7% and a top-2 accuracy of 95.5% when evaluated against an unseen corpus of Irish Sign Language (ISL) fingerspelling video recordings. We believe that this constitutes a state-of-the-art performance baseline for an Irish Sign Language recognition model when tested on an unseen dataset.

