Temporal Action Segmentation in Sign Language System for Bahasa Indonesia (SIBI) Videos Using Optical Flow-Based Approach

I. Dewa, Made Bayu, Atmaja Darmawan, Linawati, Gede Sukadarmika, Ni Made Ary, E. D. Wirastuti, Reza Pulungan
Jurnal Ilmu Komputer dan Informasi. Published 2024-06-03. DOI: 10.21609/jiki.v17i2.1284 (https://doi.org/10.21609/jiki.v17i2.1284)

Abstract

Sign language (SL) is vital in fostering communication for the deaf and hard-of-hearing communities. Continuous Sign Language Translation (CSLT) is the task of translating continuous sign language into spoken language. CSLT works by breaking continuous signing into isolated signs. Segmenting morpheme signs from phrase signs poses several challenges, such as the limited availability of annotated datasets and the complexity of continuous gesture movements. Unlike other sign languages, the Sign Language System for Bahasa Indonesia (SIBI) follows the grammatical norms of spoken Indonesian, including word formation. In SIBI, a word can consist of a root word and an affix. Temporal action segmentation is therefore important in SIBI: the translations of the individual signs must be reassembled into spoken Indonesian sentences. This research uses an optical flow approach to segment temporal actions in SIBI videos. Optical flow methods, which compute changes in intensity between adjacent frames, can be used to detect when a sign movement occurs and, conversely, when there is a pause between sign movements. The absence of intensity differences between two adjacent frames indicates a boundary between sign gestures. This study tested dense optical flow on videos of SIBI sentences performed by three signers. Several parameters of the dense optical flow algorithm, such as threshold size, PyrScale, and WinSize, were evaluated to obtain the best accuracy. The results show that the optical flow algorithm segments the videos successfully, as measured by Perf and F1r: the highest Perf and F1r values obtained were 0.8298 and 0.8524, respectively.
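
The boundary rule described in the abstract (little or no motion between adjacent frames marks a pause between signs) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a per-frame-pair motion score is already available, for example the mean magnitude of a dense flow field such as the one returned by OpenCV's `cv2.calcOpticalFlowFarneback`, whose `pyr_scale` and `winsize` arguments resemble the PyrScale and WinSize parameters named above. The function names, the `min_len` parameter, and the sample values are hypothetical.

```python
from typing import List, Tuple


def find_pause_segments(
    motion: List[float], threshold: float, min_len: int = 1
) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) index ranges where motion <= threshold.

    motion[i] is assumed to be a motion score between frame i and frame i + 1,
    e.g. the mean optical flow magnitude (an assumption, not from the paper).
    Runs shorter than min_len are ignored.
    """
    pauses: List[Tuple[int, int]] = []
    start = None
    for i, value in enumerate(motion):
        if value <= threshold:
            if start is None:
                start = i  # a low-motion run begins here
        else:
            if start is not None and i - start >= min_len:
                pauses.append((start, i - 1))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(motion) - start >= min_len:
        pauses.append((start, len(motion) - 1))
    return pauses


def split_into_signs(
    motion: List[float], threshold: float, min_len: int = 1
) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) ranges of the motion between pauses."""
    signs: List[Tuple[int, int]] = []
    cursor = 0
    for p_start, p_end in find_pause_segments(motion, threshold, min_len):
        if p_start > cursor:
            signs.append((cursor, p_start - 1))
        cursor = p_end + 1
    if cursor < len(motion):
        signs.append((cursor, len(motion) - 1))
    return signs


if __name__ == "__main__":
    # Hypothetical motion scores: two bursts of movement separated by a pause.
    scores = [0.1, 0.2, 2.5, 3.1, 2.8, 0.1, 0.0, 0.2, 1.9, 2.2, 0.1]
    print(find_pause_segments(scores, threshold=0.5, min_len=2))
    print(split_into_signs(scores, threshold=0.5, min_len=2))
```

In this sketch the threshold plays the role of the "threshold size" parameter: raising it treats more frames as pauses and yields more, shorter segments, while `min_len` keeps a single low-motion frame from splitting a sign.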