YOLOv9-Based Human Face Detection and Counting Under Human-Animal Faces, Complex Imaging Environments, and Image Qualities
Authors: Sivaranjini Perikamana Narayanan; M. Sabarimalai Manikandan; Linga Reddy Cenkeramaddi
Journal: IEEE Access, vol. 13, pp. 129600-129637
Published: 2025-07-22
DOI: 10.1109/ACCESS.2025.3591247
URL: https://ieeexplore.ieee.org/document/11087548/
Automatic human face detection and counting can play a vital role in recognizing and tracking infant and adult faces in outdoor and indoor surveillance applications and in facial vital-sign measurement. Despite advances in deep learning networks, accurate and reliable detection remains challenging in the presence of other objects, animal faces, and varying image characteristics. In this paper, we study the effectiveness of a YOLOv9-based face detection and counting method under skin color variations, different face sizes and numbers of faces, mixtures of human and animal faces, image properties (brightness, contrast, illumination, glare), image quality degradations (blur, low light, lens flare, grainy night-time images, different types of noise), and pareidolia effects. The method is trained and validated on the Wider Face database. In addition, we created image databases covering the image qualities and characteristics listed above. The YOLOv9-based face detection model achieves a precision of 86%, a recall of 62.8%, and a mean average precision of 70.8% at an inference time of 15.2 ms on the Wider Face database. Evaluation results show that YOLOv9-based face counting outperforms most state-of-the-art face detection and people counting methods, with a mean absolute error (MAE) of 3.36 and a root mean square error (RMSE) of 22.38. The model was also deployed on a Raspberry Pi edge computing platform to study real-time performance. The YOLOv9-based method achieves an MAE of 0.53-5.76 on an infant database not used in training, with Gaussian, salt-and-pepper, and speckle noise, and an MAE of 0.43-2.87 on faces inside vehicles. The study further shows that the YOLOv9 model gives promising face detection and counting results under a range of illumination and skin color variations.
Evaluation on a wide variety of animal faces and pareidolia-induced faces shows more false positives, due to the lack of contextual intelligence in deep-face models. Further, the results show that the deep-face model detects artificial faces (statues, art, paintings, posters) when it is deployed in uncontrolled face-based application environments. Model performance degrades under different kinds of noise and on blurred images. The results indicate that performance can be improved by noise-specific filtering with optimal filter parameters, but this requires automatically identifying the noise type and its parameters.
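The counting figures in the abstract are reported as MAE and RMSE over per-image face counts. As a minimal sketch of how those two metrics are usually defined (the function name `count_errors` and the sample counts are hypothetical and are not taken from the paper), one might compute them as follows:

```python
import math


def count_errors(predicted, actual):
    """Return (MAE, RMSE) between predicted and ground-truth face counts.

    Both arguments are sequences of per-image counts of equal length.
    """
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    diffs = [p - a for p, a in zip(predicted, actual)]
    # MAE: average magnitude of the per-image counting error.
    mae = sum(abs(d) for d in diffs) / len(diffs)
    # RMSE: squares each error first, so large misses weigh more heavily.
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return mae, rmse


# Hypothetical counts for four images (illustrative only, not paper data).
predicted = [3, 1, 10, 2]
actual = [3, 2, 20, 2]
mae, rmse = count_errors(predicted, actual)
print(f"MAE={mae:.2f} RMSE={rmse:.2f}")
```

RMSE is never smaller than MAE, and the gap widens when a few images carry most of the error. Under these standard definitions, an RMSE well above the MAE would be consistent with large misses concentrated in a small number of images, although the abstract does not break the error down this way.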
IEEE Access
Categories: COMPUTER SCIENCE, INFORMATION SYSTEMS; ENGINEERING, ELECTRICAL & ELECTRONIC
CiteScore: 9.80
Self-citation rate: 7.70%
Articles published: 6673
Review time: 6 weeks
Journal description:
IEEE Access® is a multidisciplinary, open access (OA), applications-oriented, all-electronic archival journal that continuously presents the results of original research or development across all of IEEE's fields of interest.
IEEE Access will publish articles that are of high interest to readers, original, technically correct, and clearly presented. Supported by author publication charges (APC), its hallmarks are a rapid peer review and publication process with open access to all readers. Unlike IEEE's traditional Transactions or Journals, reviews are "binary", in that reviewers will either Accept or Reject an article in the form it is submitted in order to achieve rapid turnaround. Especially encouraged are submissions on:
Multidisciplinary topics, or applications-oriented articles and negative results that do not fit within the scope of IEEE's traditional journals.
Practical articles discussing new experiments or measurement techniques, interesting solutions to engineering.
Development of new or improved fabrication or manufacturing techniques.
Reviews or survey articles of new or evolving fields oriented to assist others in understanding the new area.