Foreground-Driven Contrastive Learning for Unsupervised Human Keypoint Detection
Authors: Shuxian Li; Hui Luo; Zhengwei Miao; Zhixing Wang; Qiliang Bao; Jianlin Zhang
Journal: IEEE Photonics Journal, vol. 17, no. 3, pp. 1-14 (published 2025-03-07)
DOI: 10.1109/JPHOT.2025.3567754
URL: https://ieeexplore.ieee.org/document/10989733/
Human keypoint detection is valuable in computer vision tasks such as human-machine interaction. Unsupervised human keypoint detection has recently become prevalent because of data privacy concerns. Most existing methods rely on a reconstruction process that extracts appearance and pose information from transformed image pairs and spatially aligns them to obtain a reconstructed image for detection. Because these methods reconstruct the entire image, some keypoints can easily be assigned to the background region. In this work, we argue that reconstructing and detecting the foreground region independently can mitigate this issue. To this end, we propose a novel unsupervised human keypoint detection scheme for reliable detection that reconstructs and detects keypoints only in the foreground. Specifically, we first use a segmentor to separate the foreground and background of the image, so that reconstruction and detection are performed only on the foreground region. Since keypoints vary with changes in appearance and pose, we then introduce a contrastive loss to expand the feature space and enhance the network's robustness. Depending on where the segmentor is inserted, the proposed scheme has two versions: the effective version and the efficient version. Experimental results on popular datasets show that the proposed method exhibits superior performance. Specifically, on the BBC Pose dataset, the effective version achieves a $\mathbf{7.0\%}$ performance improvement, and the efficient version achieves a $\mathbf{5.7\%}$ improvement without sacrificing inference speed.
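The two ingredients named in the abstract, reconstruction restricted to the foreground and a contrastive loss, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the loss forms, so the masked mean squared error and the generic InfoNCE loss below are assumptions, and the function names `masked_reconstruction_loss` and `info_nce_loss`, the `temperature` value, and the array shapes are all hypothetical.

```python
import numpy as np


def masked_reconstruction_loss(target, recon, fg_mask, eps=1e-8):
    """Mean squared error over foreground pixels only (assumed loss form).

    target, recon: float arrays of shape (H, W, C).
    fg_mask: array of shape (H, W), 1 for foreground and 0 for background,
             e.g. the output of a segmentor.
    """
    mask = fg_mask[..., None].astype(target.dtype)  # (H, W, 1), broadcast over channels
    sq_err = (target - recon) ** 2 * mask           # background errors are zeroed out
    n_values = mask.sum() * target.shape[-1]        # foreground pixels times channels
    return sq_err.sum() / (n_values + eps)


def info_nce_loss(anchors, positives, temperature=0.1):
    """Generic InfoNCE loss (an assumption, not the paper's stated loss).

    Row i of `anchors` is pulled toward row i of `positives` and pushed
    away from every other row. Both arrays have shape (N, D).
    """
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature                        # (N, N) scaled cosine similarities
    logits = logits - logits.max(axis=1, keepdims=True)   # subtract row max for stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))                    # matching pairs sit on the diagonal


# Usage: reconstruction errors that fall in the background do not contribute.
rng = np.random.default_rng(0)
image = rng.random((8, 8, 3))
recon = image.copy()
recon[:4] += 0.5                      # corrupt only the top half of the image
fg_mask = np.zeros((8, 8))
fg_mask[4:] = 1.0                     # hypothetical foreground: the bottom half
fg_loss = masked_reconstruction_loss(image, recon, fg_mask)
full_loss = masked_reconstruction_loss(image, recon, np.ones((8, 8)))
```

In this toy case `fg_loss` is zero because every corrupted pixel lies in the background, while `full_loss` is not. That mirrors the motivation in the abstract: a whole-image objective spends capacity on the background, whereas a foreground-only objective does not.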
About the journal:
Breakthroughs in the generation of light and in its control and utilization have given rise to the field of Photonics, a rapidly expanding area of science and technology with major technological and economic impact. Photonics integrates quantum electronics and optics to accelerate progress in the generation of novel photon sources and in their utilization in emerging applications at the micro and nano scales, spanning from the far-infrared/THz to the X-ray region of the electromagnetic spectrum. IEEE Photonics Journal is an online-only journal dedicated to the rapid disclosure of top-quality peer-reviewed research at the forefront of all areas of photonics. Contributions addressing issues ranging from fundamental understanding to emerging technologies and applications are within the scope of the Journal. The Journal includes topics in: Photon sources from far infrared to X-rays, Photonics materials and engineered photonic structures, Integrated optics and optoelectronics, Ultrafast, attosecond, high field and short wavelength photonics, Biophotonics, including DNA photonics, Nanophotonics, Magnetophotonics, Fundamentals of light propagation and interaction; nonlinear effects, Optical data storage, Fiber optics and optical communications devices, systems, and technologies, Micro Opto Electro Mechanical Systems (MOEMS), Microwave photonics, Optical Sensors.