基于动态灰度片段的自监督人脸呈现攻击检测

Usman Muhammad, Mourad Oussalah
{"title":"基于动态灰度片段的自监督人脸呈现攻击检测","authors":"Usman Muhammad, Mourad Oussalah","doi":"10.1109/FG57933.2023.10042547","DOIUrl":null,"url":null,"abstract":"Face presentation attack detection (PAD) plays an important role in defending face recognition systems against presentation attacks. The success of PAD largely relies on supervised learning that requires a huge number of labeled data, which is especially challenging for videos and often requires expert knowledge. To avoid the costly collection of labeled data, this paper presents a novel method for self-supervised video representation learning via motion prediction. To achieve this, we exploit the temporal consistency based on three RGB frames which are acquired at three different times in the video sequence. The obtained frames are then transformed into grayscale images where each image is specified to three different channels such as R(red), G(green), and B(blue) to form a dynamic grayscale snippet (DGS). Motivated by this, the labels are automatically generated to increase the temporal diversity based on DGS by using the different temporal lengths of the videos, which prove to be very helpful for the downstream task. Benefiting from the self-supervised nature of our method, we report the results that outperform existing methods on four public benchmarks, namely, Replay-Attack, MSU-MFSD, CASIA-FASD, and OULU-NPU. Explainability analysis has been carried out through LIME and Grad-CAM techniques to visualize the most important features used in the DGS.","PeriodicalId":318766,"journal":{"name":"2023 IEEE 17th International Conference on Automatic Face and Gesture Recognition (FG)","volume":"38 8","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-08-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":"{\"title\":\"Self-Supervised Face Presentation Attack Detection with Dynamic Grayscale Snippets\",\"authors\":\"Usman Muhammad, Mourad Oussalah\",\"doi\":\"10.1109/FG57933.2023.10042547\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Face presentation attack detection (PAD) plays an important role in defending face recognition systems against presentation attacks. The success of PAD largely relies on supervised learning that requires a huge number of labeled data, which is especially challenging for videos and often requires expert knowledge. To avoid the costly collection of labeled data, this paper presents a novel method for self-supervised video representation learning via motion prediction. To achieve this, we exploit the temporal consistency based on three RGB frames which are acquired at three different times in the video sequence. The obtained frames are then transformed into grayscale images where each image is specified to three different channels such as R(red), G(green), and B(blue) to form a dynamic grayscale snippet (DGS). Motivated by this, the labels are automatically generated to increase the temporal diversity based on DGS by using the different temporal lengths of the videos, which prove to be very helpful for the downstream task. Benefiting from the self-supervised nature of our method, we report the results that outperform existing methods on four public benchmarks, namely, Replay-Attack, MSU-MFSD, CASIA-FASD, and OULU-NPU. Explainability analysis has been carried out through LIME and Grad-CAM techniques to visualize the most important features used in the DGS.\",\"PeriodicalId\":318766,\"journal\":{\"name\":\"2023 IEEE 17th International Conference on Automatic Face and Gesture Recognition (FG)\",\"volume\":\"38 8\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-08-27\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"5\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2023 IEEE 17th International Conference on Automatic Face and Gesture Recognition (FG)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/FG57933.2023.10042547\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 IEEE 17th International Conference on Automatic Face and Gesture Recognition (FG)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/FG57933.2023.10042547","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 5

摘要

人脸表示攻击检测(PAD)在人脸识别系统防御表示攻击中起着重要作用。PAD的成功很大程度上依赖于监督学习,而监督学习需要大量的标记数据,这对视频来说尤其具有挑战性,通常需要专业知识。为了避免昂贵的标记数据收集,本文提出了一种基于运动预测的自监督视频表示学习方法。为了实现这一点,我们利用了基于三个RGB帧的时间一致性,这些帧是在视频序列的三个不同时间获取的。然后将获得的帧转换为灰度图像,其中每个图像被指定为三个不同的通道,如R(红色),G(绿色)和B(蓝色),以形成动态灰度片段(DGS)。在此激励下,基于DGS自动生成标签,利用视频的不同时间长度来增加时间多样性,这对下游任务非常有帮助。得益于我们方法的自我监督性质,我们报告的结果在四个公共基准上优于现有方法,即重播-攻击,MSU-MFSD, CASIA-FASD和OULU-NPU。可解释性分析已通过LIME和Grad-CAM技术进行,以可视化DGS中使用的最重要特征。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Self-Supervised Face Presentation Attack Detection with Dynamic Grayscale Snippets
Face presentation attack detection (PAD) plays an important role in defending face recognition systems against presentation attacks. The success of PAD largely relies on supervised learning that requires a huge number of labeled data, which is especially challenging for videos and often requires expert knowledge. To avoid the costly collection of labeled data, this paper presents a novel method for self-supervised video representation learning via motion prediction. To achieve this, we exploit the temporal consistency based on three RGB frames which are acquired at three different times in the video sequence. The obtained frames are then transformed into grayscale images where each image is specified to three different channels such as R(red), G(green), and B(blue) to form a dynamic grayscale snippet (DGS). Motivated by this, the labels are automatically generated to increase the temporal diversity based on DGS by using the different temporal lengths of the videos, which prove to be very helpful for the downstream task. Benefiting from the self-supervised nature of our method, we report the results that outperform existing methods on four public benchmarks, namely, Replay-Attack, MSU-MFSD, CASIA-FASD, and OULU-NPU. Explainability analysis has been carried out through LIME and Grad-CAM techniques to visualize the most important features used in the DGS.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信