Weakly-supervised Fingerspelling Recognition in British Sign Language Videos

Prajwal K R, Hannah Bull, Liliane Momeni, Samuel Albanie, Gül Varol, Andrew Zisserman
{"title":"Weakly-supervised Fingerspelling Recognition in British Sign Language Videos","authors":"Prajwal K R, Hannah Bull, Liliane Momeni, Samuel Albanie, Gül Varol, Andrew Zisserman","doi":"10.48550/arXiv.2211.08954","DOIUrl":null,"url":null,"abstract":"The goal of this work is to detect and recognize sequences of letters signed using fingerspelling in British Sign Language (BSL). Previous fingerspelling recognition methods have not focused on BSL, which has a very different signing alphabet (e.g., two-handed instead of one-handed) to American Sign Language (ASL). They also use manual annotations for training. In contrast to previous methods, our method only uses weak annotations from subtitles for training. We localize potential instances of fingerspelling using a simple feature similarity method, then automatically annotate these instances by querying subtitle words and searching for corresponding mouthing cues from the signer. We propose a Transformer architecture adapted to this task, with a multiple-hypothesis CTC loss function to learn from alternative annotation possibilities. We employ a multi-stage training approach, where we make use of an initial version of our trained model to extend and enhance our training data before re-training again to achieve better performance. Through extensive evaluations, we verify our method for automatic annotation and our model architecture. Moreover, we provide a human expert annotated test set of 5K video clips for evaluating BSL fingerspelling recognition methods to support sign language research.","PeriodicalId":72437,"journal":{"name":"BMVC : proceedings of the British Machine Vision Conference. British Machine Vision Conference","volume":"43 1-2 1","pages":"609"},"PeriodicalIF":0.0000,"publicationDate":"2022-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"BMVC : proceedings of the British Machine Vision Conference. British Machine Vision Conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.48550/arXiv.2211.08954","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 4

Abstract

The goal of this work is to detect and recognize sequences of letters signed using fingerspelling in British Sign Language (BSL). Previous fingerspelling recognition methods have not focused on BSL, which has a very different signing alphabet (e.g., two-handed instead of one-handed) to American Sign Language (ASL). They also use manual annotations for training. In contrast to previous methods, our method only uses weak annotations from subtitles for training. We localize potential instances of fingerspelling using a simple feature similarity method, then automatically annotate these instances by querying subtitle words and searching for corresponding mouthing cues from the signer. We propose a Transformer architecture adapted to this task, with a multiple-hypothesis CTC loss function to learn from alternative annotation possibilities. We employ a multi-stage training approach, where we make use of an initial version of our trained model to extend and enhance our training data before re-training again to achieve better performance. Through extensive evaluations, we verify our method for automatic annotation and our model architecture. Moreover, we provide a human expert annotated test set of 5K video clips for evaluating BSL fingerspelling recognition methods to support sign language research.
英国手语视频中的弱监督拼写识别
这项工作的目标是检测和识别在英国手语(BSL)中使用手指拼写的字母序列。以前的手指拼写识别方法并没有关注BSL,它与美国手语(ASL)有着非常不同的手语字母表(例如,用双手而不是单手)。他们还使用手动注释进行训练。与之前的方法相比,我们的方法只使用来自字幕的弱注释进行训练。我们使用一种简单的特征相似度方法来定位潜在的手指拼写实例,然后通过查询字幕词和从签名者那里搜索相应的口型线索来自动注释这些实例。我们提出了一个适合于此任务的Transformer架构,其中包含一个多假设CTC损失函数,以从其他注释可能性中学习。我们采用多阶段训练方法,在再次训练之前,我们利用训练模型的初始版本扩展和增强我们的训练数据,以获得更好的表现。通过广泛的评估,我们验证了自动注释的方法和模型体系结构。此外,我们还提供了一个5K视频片段的人类专家注释测试集,用于评估BSL指纹拼写识别方法,为手语研究提供支持。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信