YOLOv9-Based Human Face Detection and Counting Under Human-Animal Faces, Complex Imaging Environments, and Image Qualities

IF 3.6 · Zone 3 (Computer Science) · Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

IEEE Access Pub Date : 2025-07-22 DOI:10.1109/ACCESS.2025.3591247

Sivaranjini Perikamana Narayanan;M. Sabarimalai Manikandan;Linga Reddy Cenkeramaddi

IEEE Access, Volume 13, Pages 129600-129637. Publication type: Journal Article. Article page: https://ieeexplore.ieee.org/document/11087548/

Citations: 0

Abstract


YOLOv9-Based Human Face Detection and Counting Under Human-Animal Faces, Complex Imaging Environments, and Image Qualities

Automatic human face detection and counting can play a vital role in the recognition and tracking of infant and adult faces in both outdoor and indoor human surveillance applications and facial-vital sign measurement. Despite the advancements in deep learning networks, accurate and reliable detection is still a challenging task in the presence of different kinds of objects, animal faces, and image characteristics. In this paper, we study the effectiveness of the YOLOv9-based face detection and counting method under skin color variations, different face sizes and number of faces, mixture of human and animal faces, image properties (brightness, contrast, illumination, glare), image qualities (image blurring, low-light facial images, lens flared images, grainy images taken at night, different types of noises), and pareidolia effects. The proposed method is trained and validated using the Wider Face database. In addition, we created image databases with different kinds of image qualities and image characteristics with the above-mentioned challenges. The YOLOv9-based face detection model achieves a precision of 86%, a recall of 62.8%, and a mean average precision of 70.8% at an inference time of 15.2 ms on the Wider Face database. Evaluation results demonstrate that the YOLOv9-based face counting outperforms most of the state-of-the-art face detection and people counting methods with a mean absolute error (MAE) of 3.36 and root mean square error (RMSE) of 22.38. The model was also deployed on the Raspberry Pi edge computing platform to study the real-time performance. The YOLOv9-based method achieves an MAE of 0.53-5.76 on the untrained infant database with Gaussian, salt and pepper, and speckle noises and an MAE of 0.43-2.87 on faces inside vehicles. The study further highlights the effectiveness of the YOLOv9 model in achieving promising face detection and counting results under a range of illumination and skin color variations. 
Evaluation results on a wide variety of animal faces and pareidolia-induced faces demonstrate more false positives due to the lack of contextual intelligence in the generation of deep-face models. Further, results show that the deep-face model detects artificial faces (statues, art, paintings, posters) if the model is deployed in uncontrolled face-based application environments. The performance of the model is degraded under different kinds of noise and blurred images. The results of this study highlight that performance can be improved by using noise-specific filtering techniques with optimal filtering parameters, but this requires the automatic identification of noise types and their corresponding parameters.
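The counting results above are reported as mean absolute error (MAE) and root mean square error (RMSE) between predicted and ground-truth face counts. The following is a minimal sketch of how these two metrics are conventionally computed from per-image counts. The function name `counting_errors` and the sample counts are hypothetical illustrations, not code or data from the paper.

```python
import math


def counting_errors(true_counts, predicted_counts):
    """Return (MAE, RMSE) between per-image true and predicted face counts."""
    if len(true_counts) != len(predicted_counts) or not true_counts:
        raise ValueError("need two non-empty sequences of equal length")
    # Signed count error for each image.
    diffs = [p - t for t, p in zip(true_counts, predicted_counts)]
    # MAE: average magnitude of the per-image error.
    mae = sum(abs(d) for d in diffs) / len(diffs)
    # RMSE: square root of the average squared error; penalizes large misses.
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return mae, rmse


# Hypothetical counts for four images (illustrative only, not from the paper).
true_counts = [3, 10, 1, 25]
predicted_counts = [3, 8, 2, 20]
mae, rmse = counting_errors(true_counts, predicted_counts)
print(f"MAE={mae:.2f} RMSE={rmse:.2f}")
```

RMSE is never smaller than MAE, and the gap widens when a few images carry large errors. A reported RMSE well above the MAE would therefore be consistent with errors concentrated in a small number of images, although the abstract does not say which images those are.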


Source journal

IEEE Access (COMPUTER SCIENCE, INFORMATION SYSTEMS; ENGINEERING, ELECTRICAL & ELECTRONIC)

CiteScore: 9.80

Self-citation rate: 7.70%

Articles published: 6673

Review time: 6 weeks

About the journal: IEEE Access® is a multidisciplinary, open access (OA), applications-oriented, all-electronic archival journal that continuously presents the results of original research or development across all of IEEE's fields of interest. IEEE Access will publish articles that are of high interest to readers, original, technically correct, and clearly presented. Supported by author publication charges (APC), its hallmarks are a rapid peer review and publication process with open access to all readers. Unlike IEEE's traditional Transactions or Journals, reviews are "binary", in that reviewers will either Accept or Reject an article in the form it is submitted in order to achieve rapid turnaround. Especially encouraged are submissions on: Multidisciplinary topics, or applications-oriented articles and negative results that do not fit within the scope of IEEE's traditional journals. Practical articles discussing new experiments or measurement techniques, interesting solutions to engineering. Development of new or improved fabrication or manufacturing techniques. Reviews or survey articles of new or evolving fields oriented to assist others in understanding the new area.