Lightweight sign language intelligent recognition model based on improved R-C3D

IF 4.3 | Zone 3, Computer Science | JCR Q1, Computer Science, Artificial Intelligence
Haofei Chen, Chang’an Di
Journal: Egyptian Informatics Journal, volume 32, Article 100801
Published: 2025-10-16
Publication type: Journal Article
DOI: 10.1016/j.eij.2025.100801
URL: https://www.sciencedirect.com/science/article/pii/S111086652500194X
Citations: 0

Abstract

This study proposes a continuous dynamic sign language recognition model based on an improved region 3D convolutional network (R-C3D). A 3D convolutional network serves as the feature extraction sub-network, and depthwise separable convolution is introduced into it to reduce computational cost. An inverted residual structure is adopted to avoid information loss. In addition, the pre-selection box (anchor) sizes of the R-C3D are shortened and the action judgment threshold is raised to improve action accuracy. The improved 3D convolutional network reached an average accuracy of 44.2 %, higher than the other feature extraction sub-networks compared. After the pre-selection boxes were shortened, the average accuracy of the temporal proposal sub-network rose from 41.6 % to 44.5 %, and the loss value fell from 0.5 to 0.46. Raising the action judgment threshold from 0.5 to 0.7 reduced the loss value from 0.58 to 0.17. The complete improved R-C3D had a loss value of only 0.15, a sign language recognition time of 183 ms, and an average accuracy of 44.6 %, outperforming the other sign language recognition schemes compared. These results indicate that the improved R-C3D can recognize continuous sign language actions accurately and quickly.
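To make the cost argument in the abstract concrete, the following minimal sketch compares the weight count of a standard 3D convolution with that of a depthwise separable 3D convolution (a per-channel spatial-temporal filter followed by a 1x1x1 pointwise convolution). The channel counts (64 in, 128 out), the 3x3x3 kernel, and the function names are illustrative assumptions; they are not taken from the paper, whose actual layer configuration is not given in the abstract.

```python
def conv3d_params(c_in, c_out, k=(3, 3, 3)):
    """Weight count of a standard 3D convolution (bias omitted)."""
    kt, kh, kw = k
    return c_in * c_out * kt * kh * kw


def depthwise_separable_conv3d_params(c_in, c_out, k=(3, 3, 3)):
    """Weight count of a depthwise 3D conv plus a 1x1x1 pointwise conv (bias omitted)."""
    kt, kh, kw = k
    depthwise = c_in * kt * kh * kw   # one k-sized filter per input channel
    pointwise = c_in * c_out          # 1x1x1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
standard = conv3d_params(64, 128)
separable = depthwise_separable_conv3d_params(64, 128)
print(standard, separable, round(standard / separable, 1))
# prints: 221184 9920 22.3
```

Under these assumed sizes the separable layer needs roughly 22 times fewer weights, which is the kind of saving that motivates using it in a lightweight model. The reduction in the paper's own network depends on its real channel widths and kernel sizes.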
Source journal
Egyptian Informatics Journal (Decision Sciences: Management Science and Operations Research)
CiteScore: 11.10
Self-citation rate: 1.90%
Articles published: 59
Review time: 110 days
About the journal: The Egyptian Informatics Journal is published by the Faculty of Computers and Artificial Intelligence, Cairo University. The journal provides a forum for state-of-the-art research and development in the fields of computing, including computer science, information technology, information systems, operations research, and decision support. Innovative, previously unpublished work in the subjects covered by the journal is welcome from academic, research, and commercial sources.