Multi-Scale Feature Fusion Network for Lip Recognition

Haohuai Lin, Bowen Liu, Gangdong Zhang, Qiang Yin, Liuqing Yang, Ping Lan
Published in: 2024 IEEE 4th International Conference on Power, Electronics and Computer Applications (ICPECA), pp. 541-545
Publication date: 2024-01-26
DOI: 10.1109/ICPECA60615.2024.10471068

Abstract

Visual speech recognition (VSR), also known as lip recognition, has been widely explored in recent years thanks to advances in deep learning. Lip recognition is a classification problem in which the subtle movements of the lips carry the most important information, which places high demands on a model's ability to extract features of minor variations around the lips. In this paper, a multi-branch feature fusion network based on three-dimensional convolutional neural networks (3D CNN) is proposed for extracting spatiotemporal features from continuous image sequences. The multi-branch feature fusion network extracts both local and global characteristics from the image sequence and further enhances the feature information, so that more accurate features are delivered to the back-end classification network. Many well-performing methods depend on large volumes of data; to test the effect on a small-scale dataset, the experiments are conducted on the OuluVS2 dataset and yield encouraging results. After 20 iterations of the experiment, the maximum accuracy improves by 0.8% in absolute terms and the average accuracy improves by 1%.
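The abstract gives no architectural details, so the following is only a minimal sketch of the general idea of multi-branch, multi-scale 3D convolution with feature fusion, not the authors' network. The kernel sizes (1, 3, 5), the random weights, the ReLU activation, fusion by stacking branch outputs along a channel axis, and the names `conv3d_relu`, `random_kernel`, and `multi_branch_fusion` are all assumptions made for illustration. It uses only the Python standard library and a tiny single-channel clip.

```python
import random
from itertools import product


def conv3d_relu(clip, kernel):
    """Zero-padded 3D cross-correlation of a [T][H][W] clip with a cubic
    k*k*k kernel (k odd), followed by ReLU. Output has the same T, H, W."""
    T, H, W = len(clip), len(clip[0]), len(clip[0][0])
    k = len(kernel)
    r = k // 2  # padding radius that keeps the output size unchanged
    out = [[[0.0] * W for _ in range(H)] for _ in range(T)]
    for t, y, x in product(range(T), range(H), range(W)):
        acc = 0.0
        for dt, dy, dx in product(range(k), repeat=3):
            tt, yy, xx = t + dt - r, y + dy - r, x + dx - r
            # Positions outside the clip count as zeros (zero padding).
            if 0 <= tt < T and 0 <= yy < H and 0 <= xx < W:
                acc += clip[tt][yy][xx] * kernel[dt][dy][dx]
        out[t][y][x] = max(acc, 0.0)  # ReLU
    return out


def random_kernel(k, rng):
    """Hypothetical stand-in for learned weights: a random k*k*k kernel."""
    return [[[rng.uniform(-1.0, 1.0) for _ in range(k)]
             for _ in range(k)] for _ in range(k)]


def multi_branch_fusion(clip, kernel_sizes=(1, 3, 5), seed=0):
    """Run one 3D-conv branch per kernel size and fuse the results by
    stacking them along a new channel axis: result is [C][T][H][W].
    Small kernels see local detail, larger ones a wider spatiotemporal
    context (assumed fusion scheme, not taken from the paper)."""
    rng = random.Random(seed)
    return [conv3d_relu(clip, random_kernel(k, rng)) for k in kernel_sizes]


# Tiny synthetic clip: 4 frames of 6x6 pixels (placeholder values, not real data).
rng = random.Random(1)
clip = [[[rng.random() for _ in range(6)] for _ in range(6)] for _ in range(4)]

fused = multi_branch_fusion(clip)
# Channels, frames, height, width of the fused feature volume.
print(len(fused), len(fused[0]), len(fused[0][0]), len(fused[0][0][0]))
```

In a trained model the kernels would be learned and the fused features passed to the back-end classifier. A deep learning framework's 3D convolution layer would replace the explicit loops, which are written out here only to make the operation visible.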