{"title":"Transformer-based similarity learning for re-identification of chickens","authors":"Christian Lamping, Gert Kootstra, Marjolein Derks","doi":"10.1016/j.atech.2025.100945","DOIUrl":null,"url":null,"abstract":"<div><div>Continuous animal monitoring relies heavily on the ability to re-identify individuals over time, essential for both short-term tracking, such as video analysis, and long-term monitoring of animal conditions. Traditionally, livestock re-identification is approached using tags or sensors, which require additional handling effort and potentially impact animal welfare. In response to these limitations, non-invasive vision-based approaches have emerged recently, with existing research primarily focusing on the re-identification of pigs and cows. Re-identification of chickens, which exhibit high uniformity and are housed in larger groups, remains challenging and has received less research attention. This study addresses this gap by exploring the feasibility of re-identifying individual laying hens within uncontrolled farm environments using images of their heads. It proposes the first similarity-learning approach based on a VisionTransformer architecture to re-identify chickens without requiring training images for each individual bird. In our experiments, we compared the transformer-based approach to traditional CNN architectures while assessing the impact of different model sizes and triplet mining strategies during training. Moreover, we evaluated practical applicability by analyzing the effects of the number of images per chicken and overall population size on re-identification accuracy. Finally, we examined which visual features of the chicken head were most relevant for re-identification. Results show Top-1 accuracies exceeding 80 % for small groups and maintaining over 40 % accuracy for a population of 100 chickens. Moreover, it was shown that the transformer-based architecture outperformed CNN models, with the use of semi-hard negative samples during training yielding the best results. Furthermore, it was revealed that the evaluated models learned to prioritize features such as the comb, wattles, and ear lobes, often aligning with human perception. These results demonstrate promising potential for re-identifying chickens even when recorded in an uncontrolled farm environment, providing a foundation for future applications in animal tracking and monitoring.</div></div>","PeriodicalId":74813,"journal":{"name":"Smart agricultural technology","volume":"11 ","pages":"Article 100945"},"PeriodicalIF":6.3000,"publicationDate":"2025-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Smart agricultural technology","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2772375525001789","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"AGRICULTURAL ENGINEERING","Score":null,"Total":0}
Citations: 0
Abstract
Continuous animal monitoring relies heavily on the ability to re-identify individuals over time, which is essential both for short-term tracking, such as video analysis, and for long-term monitoring of animal condition. Traditionally, livestock re-identification has been approached using tags or sensors, which require additional handling effort and can impact animal welfare. In response to these limitations, non-invasive vision-based approaches have emerged recently, with existing research focusing primarily on the re-identification of pigs and cows. Re-identification of chickens, which exhibit high visual uniformity and are housed in larger groups, remains challenging and has received less research attention. This study addresses this gap by exploring the feasibility of re-identifying individual laying hens in uncontrolled farm environments using images of their heads. It proposes the first similarity-learning approach based on a Vision Transformer architecture that re-identifies chickens without requiring training images of each individual bird. In our experiments, we compared the transformer-based approach to traditional CNN architectures while assessing the impact of different model sizes and triplet mining strategies during training. Moreover, we evaluated practical applicability by analyzing how the number of images per chicken and the overall population size affect re-identification accuracy. Finally, we examined which visual features of the chicken head were most relevant for re-identification. Results show Top-1 accuracies exceeding 80 % for small groups and remaining above 40 % for a population of 100 chickens. The transformer-based architecture outperformed the CNN models, and training with semi-hard negative samples yielded the best results. Furthermore, the evaluated models learned to prioritize features such as the comb, wattles, and ear lobes, often aligning with human perception. These results demonstrate promising potential for re-identifying chickens even when they are recorded in an uncontrolled farm environment, providing a foundation for future applications in animal tracking and monitoring.
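The abstract names the training objective only at a high level (triplet mining, with semi-hard negatives performing best). For reference, the standard formulation of semi-hard triplet mining from the FaceNet line of work is sketched below in PyTorch; the function name, margin value, and batch-mining loop are illustrative assumptions, not the authors' implementation. A semi-hard negative is one that already lies farther from the anchor than the positive but still falls inside the margin band, so it violates the margin without collapsing training the way the hardest negatives can.

```python
import torch
import torch.nn.functional as F

def semi_hard_triplet_loss(embeddings: torch.Tensor,
                           labels: torch.Tensor,
                           margin: float = 0.2) -> torch.Tensor:
    """Triplet loss with semi-hard negative mining over one batch.

    For an anchor a and positive p (same identity), a semi-hard
    negative n (different identity) satisfies
        d(a, p) < d(a, n) < d(a, p) + margin.
    This is a generic sketch, not the paper's code.
    """
    # All pairwise Euclidean distances within the batch.
    dists = torch.cdist(embeddings, embeddings, p=2)
    same_id = labels.unsqueeze(0) == labels.unsqueeze(1)
    batch_size = embeddings.size(0)

    losses = []
    for a in range(batch_size):
        for p in range(batch_size):
            if a == p or not same_id[a, p]:
                continue  # (a, p) must be two images of the same bird
            d_ap = dists[a, p]
            # Negatives of a different identity inside the semi-hard band.
            band = (~same_id[a]) & (dists[a] > d_ap) & (dists[a] < d_ap + margin)
            if band.any():
                d_an = dists[a][band].min()  # hardest semi-hard negative
                losses.append(F.relu(d_ap - d_an + margin))
    if not losses:  # no semi-hard triplet found in this batch
        return embeddings.new_zeros(())
    return torch.stack(losses).mean()
```

In this setup the embeddings would come from the backbone under comparison, e.g. a Vision Transformer with its classification head replaced by an embedding projection (timm's `vit_base_patch16_224` is one plausible choice; the abstract does not state the exact backbone configuration). At inference, a query head image is matched to the gallery identity with the nearest embedding, which is what the reported Top-1 accuracy measures.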