{"title":"基于时空特征学习的视频鸡行为识别与定位","authors":"Yilei Hu , Jinyang Xu , Zhichao Gou , Di Cui","doi":"10.1016/j.aiia.2025.06.006","DOIUrl":null,"url":null,"abstract":"<div><div>Timely acquisition of chicken behavioral information is crucial for assessing chicken health status and production performance. Video-based behavior recognition has emerged as a primary technique for obtaining such information due to its accuracy and robustness. Video-based models generally predict a single behavior from a single video segment of a fixed duration. However, during periods of high activity in poultry, behavior transition may occur within a video segment, and existing models often fail to capture such transitions effectively. This limitation highlights the insufficient temporal resolution of video-based behavior recognition models. This study presents a chicken behavior recognition and localization model, CBLFormer, which is based on spatiotemporal feature learning. The model was designed to recognize behaviors that occur before and after transitions in video segments and to localize the corresponding time interval for each behavior. An improved transformer block, the cascade encoder-decoder network (CEDNet), a transformer-based head, and weighted distance intersection over union (WDIoU) loss were integrated into CBLFormer to enhance the model's ability to distinguish between different behavior categories and locate behavior boundaries. For the training and testing of CBLFormer, a dataset was created by collecting videos from 320 chickens across different ages and rearing densities. The results showed that CBLFormer achieved a [email protected]:0.95 of 98.34 % on the test set. The integration of CEDNet contributed the most to the performance improvement of CBLFormer. The visualization results confirmed that the model effectively captured the behavioral boundaries of chickens and correctly recognized behavior categories. The transfer learning results demonstrated that the model is applicable to chicken behavior recognition and localization tasks in real-world poultry farms. The proposed method handles cases where poultry behavior transitions occur within the video segment and improves the temporal resolution of video-based behavior recognition models.</div></div>","PeriodicalId":52814,"journal":{"name":"Artificial Intelligence in Agriculture","volume":"15 4","pages":"Pages 816-828"},"PeriodicalIF":8.2000,"publicationDate":"2025-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Recognizing and localizing chicken behaviors in videos based on spatiotemporal feature learning\",\"authors\":\"Yilei Hu , Jinyang Xu , Zhichao Gou , Di Cui\",\"doi\":\"10.1016/j.aiia.2025.06.006\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Timely acquisition of chicken behavioral information is crucial for assessing chicken health status and production performance. Video-based behavior recognition has emerged as a primary technique for obtaining such information due to its accuracy and robustness. Video-based models generally predict a single behavior from a single video segment of a fixed duration. However, during periods of high activity in poultry, behavior transition may occur within a video segment, and existing models often fail to capture such transitions effectively. This limitation highlights the insufficient temporal resolution of video-based behavior recognition models. 
This study presents a chicken behavior recognition and localization model, CBLFormer, which is based on spatiotemporal feature learning. The model was designed to recognize behaviors that occur before and after transitions in video segments and to localize the corresponding time interval for each behavior. An improved transformer block, the cascade encoder-decoder network (CEDNet), a transformer-based head, and weighted distance intersection over union (WDIoU) loss were integrated into CBLFormer to enhance the model's ability to distinguish between different behavior categories and locate behavior boundaries. For the training and testing of CBLFormer, a dataset was created by collecting videos from 320 chickens across different ages and rearing densities. The results showed that CBLFormer achieved a [email protected]:0.95 of 98.34 % on the test set. The integration of CEDNet contributed the most to the performance improvement of CBLFormer. The visualization results confirmed that the model effectively captured the behavioral boundaries of chickens and correctly recognized behavior categories. The transfer learning results demonstrated that the model is applicable to chicken behavior recognition and localization tasks in real-world poultry farms. The proposed method handles cases where poultry behavior transitions occur within the video segment and improves the temporal resolution of video-based behavior recognition models.</div></div>\",\"PeriodicalId\":52814,\"journal\":{\"name\":\"Artificial Intelligence in Agriculture\",\"volume\":\"15 4\",\"pages\":\"Pages 816-828\"},\"PeriodicalIF\":8.2000,\"publicationDate\":\"2025-06-21\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Artificial Intelligence in Agriculture\",\"FirstCategoryId\":\"1087\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S2589721725000698\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"AGRICULTURE, MULTIDISCIPLINARY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Artificial Intelligence in Agriculture","FirstCategoryId":"1087","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2589721725000698","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"AGRICULTURE, MULTIDISCIPLINARY","Score":null,"Total":0}
Recognizing and localizing chicken behaviors in videos based on spatiotemporal feature learning
Timely acquisition of chicken behavioral information is crucial for assessing chicken health status and production performance. Video-based behavior recognition has emerged as a primary technique for obtaining such information due to its accuracy and robustness. Video-based models generally predict a single behavior from a single video segment of fixed duration. However, during periods of high activity in poultry, behavior transitions may occur within a video segment, and existing models often fail to capture such transitions effectively. This limitation highlights the insufficient temporal resolution of video-based behavior recognition models. This study presents a chicken behavior recognition and localization model, CBLFormer, based on spatiotemporal feature learning. The model was designed to recognize behaviors that occur before and after transitions within video segments and to localize the corresponding time interval for each behavior. An improved transformer block, a cascade encoder-decoder network (CEDNet), a transformer-based head, and a weighted distance intersection over union (WDIoU) loss were integrated into CBLFormer to enhance the model's ability to distinguish between behavior categories and to locate behavior boundaries. For the training and testing of CBLFormer, a dataset was created by collecting videos of 320 chickens across different ages and rearing densities. The results showed that CBLFormer achieved a mAP@0.5:0.95 of 98.34% on the test set, with the integration of CEDNet contributing the most to the performance improvement. The visualization results confirmed that the model effectively captured the behavioral boundaries of chickens and correctly recognized behavior categories. The transfer learning results demonstrated that the model is applicable to chicken behavior recognition and localization tasks in real-world poultry farms. The proposed method handles cases in which poultry behavior transitions occur within a video segment and improves the temporal resolution of video-based behavior recognition models.
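The abstract does not give the formulation of the WDIoU loss, but the general idea behind distance-IoU-style objectives over temporal intervals can be sketched. The snippet below is a hypothetical, minimal illustration of an unweighted 1-D temporal DIoU loss: the function name `temporal_diou_loss` and its exact form are assumptions for illustration, not the paper's WDIoU (whose weighting scheme is defined in the article itself). It only shows how a loss of this family penalizes both low interval overlap and center misalignment between a predicted behavior segment and its ground-truth interval.

```python
# Illustrative sketch only: a 1-D distance-IoU (DIoU) style loss over temporal
# intervals, the general family to which the paper's WDIoU loss belongs.
# The paper's specific weighting is NOT reproduced here.
import torch


def temporal_diou_loss(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """pred, target: (N, 2) tensors of (start, end) times with end > start."""
    ps, pe = pred[:, 0], pred[:, 1]
    ts, te = target[:, 0], target[:, 1]

    # Temporal IoU: overlap length over union length of the two intervals.
    inter = (torch.min(pe, te) - torch.max(ps, ts)).clamp(min=0)
    union = (pe - ps) + (te - ts) - inter
    iou = inter / union.clamp(min=1e-6)

    # Distance penalty: squared gap between interval centers, normalized by the
    # squared length of the smallest interval enclosing both segments.
    center_dist = ((ps + pe) / 2 - (ts + te) / 2) ** 2
    enclose = (torch.max(pe, te) - torch.min(ps, ts)).clamp(min=1e-6)
    diou = iou - center_dist / enclose ** 2

    return (1 - diou).mean()


# Example: one predicted segment vs. its ground-truth behavior interval (seconds).
pred = torch.tensor([[1.2, 4.8]])
target = torch.tensor([[1.0, 5.0]])
print(temporal_diou_loss(pred, target))  # ~0.1, i.e. a small loss for a close match
```

The reported metric, mAP@0.5:0.95, follows the usual convention of averaging average precision over temporal IoU thresholds from 0.5 to 0.95, so a prediction counts as correct only when its interval overlaps the ground-truth interval tightly enough at each threshold.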