A top-down deep neural network for multi-dairy cows pose estimation and lameness detection

Saisai Wu, Shuqing Han, Xiaoxiang Mo, Yingying Wei, Yuanyuan Qin, He Chen, Jianzhai Wu, Zhikang Zeng

Computers and Electronics in Agriculture, Volume 239, Article 110911, published 2025-09-05. DOI: 10.1016/j.compag.2025.110911
Citations: 0
Abstract
Cow pose estimation and real-time health monitoring are important for refined herd management, improved animal welfare, and reduced passive culling rates. However, existing multi-object pose estimation methods often struggle to adapt to multi-scale objects in complex environments and typically exhibit low accuracy in detecting occluded keypoints. To address these challenges, this study proposes a top-down deep neural network for multi-dairy cows pose estimation and lameness detection, which integrates lightweight object detection, multi-scale feature fusion, and comprehensive motion feature analysis to improve robustness under complex farm conditions. First, the real-time object detector YOLOv8n is improved by introducing the Partial Convolution (PConv) and Slim-neck modules, which improve both the efficiency and accuracy of object bounding box predictions, providing a solid foundation for the subsequent pose estimation. Second, a Path Aggregation Feature Pyramid Network (PAFPN)-based multi-scale feature fusion module is introduced as the neck network within the Real-Time Multi-Person Pose Estimation (RTMPose) framework. This is further supported by a transfer learning strategy to improve keypoint localization, particularly under occlusion and scale variation conditions. The experimental results show that the improved model achieves a mean average precision (mAP) of 95.8 %, significantly outperforming the baseline model and other existing algorithms. Seven motion features, including gait symmetry, head swing amplitude, and back curvature, were extracted in real time through pose tracking and motion trajectory analysis. These features were normalized and input into a Random Forest classifier for lameness detection. The model was evaluated on a dataset of 418 dairy cows and achieved average accuracy, sensitivity, and specificity values of 93.8 %, 94.4 %, and 97.5 %, respectively. These results demonstrate that combining multiple motion features provides a more accurate assessment of lameness.
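The abstract describes a final lameness-detection stage in which seven pose-derived motion features are normalized and fed to a Random Forest classifier, which is then evaluated by accuracy, sensitivity, and specificity. The sketch below illustrates that stage only, as a minimal assumption-laden example: the feature names beyond the three quoted in the abstract (gait symmetry, head swing amplitude, back curvature), the train/test split, the normalization method, and the forest size are illustrative choices, not the authors' reported configuration.

```python
# Minimal sketch of the lameness-classification stage outlined in the abstract:
# normalized motion features -> Random Forest -> accuracy / sensitivity / specificity.
# Feature names beyond those quoted in the abstract, the split strategy, and the
# hyperparameters are assumptions for illustration, not the paper's exact setup.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import confusion_matrix

# Seven motion features; the last four names are hypothetical placeholders.
FEATURES = [
    "gait_symmetry", "head_swing_amplitude", "back_curvature",
    "stride_length", "stance_time", "step_height", "walking_speed",
]

def train_and_evaluate(X: np.ndarray, y: np.ndarray, seed: int = 0):
    """X: (n_cows, 7) raw motion features; y: 0 = sound, 1 = lame."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed)

    scaler = MinMaxScaler()                       # normalize each feature to [0, 1]
    X_train = scaler.fit_transform(X_train)
    X_test = scaler.transform(X_test)

    clf = RandomForestClassifier(n_estimators=200, random_state=seed)
    clf.fit(X_train, y_train)

    tn, fp, fn, tp = confusion_matrix(y_test, clf.predict(X_test)).ravel()
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)                  # true-positive rate for lame cows
    specificity = tn / (tn + fp)                  # true-negative rate for sound cows
    return accuracy, sensitivity, specificity
```

In this sketch the normalization is fit on the training split only and reused on the test split, mirroring the abstract's description of normalized features entering the classifier while avoiding data leakage; sensitivity and specificity are computed directly from the confusion matrix, matching the metrics the paper reports.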
Journal Introduction:
Computers and Electronics in Agriculture provides international coverage of advancements in computer hardware, software, electronic instrumentation, and control systems applied to agricultural challenges. Encompassing agronomy, horticulture, forestry, aquaculture, and animal farming, the journal publishes original papers, reviews, and application notes. It explores the use of computers and electronics in plant or animal agricultural production, covering topics like agricultural soils, water, pests, controlled environments, and waste. The scope extends to on-farm post-harvest operations and relevant technologies, including artificial intelligence, sensors, machine vision, robotics, networking, and simulation modeling. Its companion journal, Smart Agricultural Technology, continues the focus on smart applications in production agriculture.