Le Han;Kaixuan Chen;Lei Zhao;Yangbo Jiang;Pengfei Wang;Nenggan Zheng
Title: Cross-Domain Animal Pose Estimation With Skeleton Anomaly-Aware Learning
Journal: IEEE Transactions on Circuits and Systems for Video Technology, vol. 35, no. 9, pp. 9148-9160
Publication date: 2025-04-04
Publication type: Journal Article
DOI: 10.1109/TCSVT.2025.3557844
URL: https://ieeexplore.ieee.org/document/10949647/
Impact factor: 11.1
JCR: Q1 (Engineering, Electrical & Electronic)
Subject category: Engineering and Technology (Region 1)
Open access: no
Citations: 0
Abstract
Animal pose estimation is often constrained by the scarcity of annotations and the diversity of scenarios and species. Unsupervised domain adaptation based on pseudo-label generation, which filters the predicted keypoints of unlabeled data by skeleton position consistency, has proven effective for such problems. However, existing methods generate pseudo-labels with many false positives because they cannot effectively distinguish sample pairs that share the same errors. In this study, we propose a cross-domain animal pose estimation model from the novel perspective of skeleton anomaly learning. We construct a graph contrastive learning mechanism to acquire skeleton anomaly-aware knowledge, which enables the generation of accurate pseudo-labels for the target domain and imposes a graph constraint on unlabeled data. We also design a skeleton anomaly-feedback based domain adaptation framework to facilitate implicit alignment of object-specific features and joint cross-domain training. In addition, we introduce a new rat pose dataset, UDARP-9.4K, to address the lack of small-animal pose datasets covering diverse experimental scenarios. Related datasets are reviewed and evaluated in detail. Extensive experiments on UDARP-9.4K and two public datasets demonstrate the superiority of the proposed model in cross-scenario and cross-species animal pose estimation tasks. Further analysis shows the effectiveness of the proposed model for skeleton structure feature learning. The UDARP-9.4K dataset is available at https://github.com/CSDLLab/UDARP-9.4K-Dataset.
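The limitation of consistency-based pseudo-labeling described in the abstract can be illustrated with a small sketch. This is not the paper's method: it is a generic position-consistency filter, and the function name `consistency_mask`, the threshold, and all coordinates are hypothetical values chosen for illustration. It assumes two predictions of the same image have already been mapped into a common coordinate frame.

```python
import math

def consistency_mask(pred_a, pred_b, max_dist):
    """Mark a keypoint as a pseudo-label candidate when the two
    predictions agree to within max_dist pixels."""
    return [math.dist(p, q) <= max_dist for p, q in zip(pred_a, pred_b)]

# Hypothetical ground truth for three keypoints (nose, neck, tail base).
truth = [(10.0, 10.0), (20.0, 12.0), (40.0, 15.0)]

# Two predictions for the same image. They disagree on the neck,
# but both place the tail base at nearly the same wrong location.
pred_a = [(10.5, 10.0), (20.0, 12.5), (25.0, 30.0)]
pred_b = [(10.0, 10.4), (26.0, 18.0), (25.2, 30.1)]

mask = consistency_mask(pred_a, pred_b, max_dist=2.0)

# Distance of each prediction in pred_a from the (hypothetical) truth.
errors = [math.dist(p, t) for p, t in zip(pred_a, truth)]

for name, keep, err in zip(["nose", "neck", "tail base"], mask, errors):
    print(f"{name}: kept={keep}, error={err:.1f}px")
```

In this toy case the neck is rejected because the two predictions disagree, but the tail base is accepted even though it is far from the true position, since both predictions share the same error. A consistency check alone cannot catch this kind of false positive, which is the gap the abstract says skeleton anomaly-aware learning is meant to address.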
About the journal:
The IEEE Transactions on Circuits and Systems for Video Technology (TCSVT) is dedicated to covering all aspects of video technologies from a circuits and systems perspective. We encourage submissions of general, theoretical, and application-oriented papers related to image and video acquisition, representation, presentation, and display. Additionally, we welcome contributions in areas such as processing, filtering, and transforms; analysis and synthesis; learning and understanding; compression, transmission, communication, and networking; as well as storage, retrieval, indexing, and search. Furthermore, papers focusing on hardware and software design and implementation are highly valued. Join us in advancing the field of video technology through innovative research and insights.