Utilizing Alignment Loss to Advance Eye Center Detection and Face Recognition in the LWIR Band

IEEE Transactions on Biometrics, Behavior, and Identity Science | Pub Date: 2023-03-02 | DOI: 10.1109/TBIOM.2023.3251738

Suha Reddy Mokalla;Thirimachos Bourlai

Volume 5, Issue 2, pp. 255-265. Full record: https://ieeexplore.ieee.org/document/10057458/

Citations: 0


Utilizing Alignment Loss to Advance Eye Center Detection and Face Recognition in the LWIR Band

Geometric normalization is an integral part of most face recognition (FR) systems. To geometrically normalize a face, it is essential to detect the eye centers, since one way to align face images is to make the line joining the eye centers horizontal. This paper proposes a novel approach to detect eye centers in the challenging Long-Wave Infrared (LWIR) spectrum (8-$14~\mu\text{m}$). While using thermal band images for face recognition is a feasible approach in low-light and nighttime conditions, where visible face images cannot be used, few thermal or dual-band (visible and thermal) face datasets are available to train and test new eye center detection models. This work takes advantage of the available deep learning based eye center detection algorithms in the visible band to detect the eye centers in thermal face images through image synthesis. After empirically evaluating different image synthesis models, we determine that StarGAN2 yields the highest eye center detection accuracy compared to the other state-of-the-art models. We incorporate an alignment loss that captures the normalized error between the detected and actual eye centers as an additional loss term during training (using the generated images, ground truth annotations, and an eye center detection model), so that the model learns to align the images to minimize this error. During the test phase, visible images are generated from the thermal images using the trained model. Then, available landmark detection algorithms in the visible band, namely MT-CNN and HR-Net, are used to detect the eye centers. Next, these eye centers are used to geometrically normalize the source thermal face images before performing same-spectral (thermal-to-thermal) face recognition. The proposed method improved the eye center detection accuracy by 60% over the baseline model, and by 14% over training only the StarGAN2 model without the alignment loss. The proposed approach also reports the highest improvement in face recognition accuracy, 36% and 3% over the baseline and original StarGAN2 models, respectively, when using deep learning based face recognition models, namely Facenet, ArcFace, and VGG-Face. We also perform experiments that augment the train and test datasets with images rotated in-plane to further demonstrate the efficiency of the proposed approach. When CycleGAN (another unpaired image translation network) was used to generate images before incorporating the alignment loss, it failed to preserve the alignment even slightly, so the eye center detection accuracy was extremely low. With the alignment loss, the accuracy increased by 20%, 50%, and 80% when the normalized error (e) $\le 0.05$, 0.10, and 0.25, respectively.
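The abstract does not spell out how the normalized error e is computed. A minimal sketch in plain Python follows, assuming a common convention: the larger of the two eye-center errors divided by the ground-truth inter-ocular distance. The function names and toy coordinates are hypothetical and are not taken from the paper. The sketch also computes the in-plane angle of the line joining the eye centers, which is the quantity that geometric normalization drives to zero.

```python
import math


def normalized_eye_error(pred_left, pred_right, true_left, true_right):
    """Worst-case eye-center error divided by the true inter-ocular distance.

    ASSUMPTION: this max-over-both-eyes convention is a common choice in the
    eye-localization literature; the abstract does not state the exact formula.
    """
    d_left = math.dist(pred_left, true_left)
    d_right = math.dist(pred_right, true_right)
    inter_ocular = math.dist(true_left, true_right)
    return max(d_left, d_right) / inter_ocular


def accuracy_at(errors, threshold):
    """Fraction of faces whose normalized error is at or below the threshold."""
    return sum(e <= threshold for e in errors) / len(errors)


def roll_angle_deg(left_eye, right_eye):
    """Angle in degrees between the eye line and the horizontal image axis.

    Points are (x, y) pixel coordinates. Whether the image must be rotated by
    +angle or -angle to level the eyes depends on the imaging library's
    rotation convention, so that step is left out here.
    """
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return math.degrees(math.atan2(dy, dx))


# Hypothetical toy data: one ground-truth eye pair, three detections.
true_l, true_r = (30.0, 40.0), (70.0, 40.0)  # inter-ocular distance = 40 px
preds = [
    ((31.0, 40.0), (70.0, 41.0)),  # both eyes off by 1 px
    ((33.0, 44.0), (70.0, 40.0)),  # left eye off by 5 px
    ((30.0, 40.0), (78.0, 40.0)),  # right eye off by 8 px
]
errors = [normalized_eye_error(pl, pr, true_l, true_r) for pl, pr in preds]

for t in (0.05, 0.10, 0.25):
    print(f"accuracy at e <= {t:.2f}: {accuracy_at(errors, t):.2f}")
print(f"eye-line angle: {roll_angle_deg((30.0, 40.0), (70.0, 50.0)):.1f} deg")
```

With these toy values, one of the three detections falls within e ≤ 0.05 and all three fall within e ≤ 0.25. A trainable alignment loss would need a differentiable version of this error inside the synthesis network's training loop, which this sketch does not attempt.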


Source journal

IEEE Transactions on Biometrics, Behavior, and Identity Science

CiteScore

10.90

Self-citation rate

0.00%

Articles published