Digital Sentience? Evaluating the Integration of AI-Driven Tools in Animal Welfare Assessment

Animal Research and One Health. Pub Date: 2025-05-13. DOI: 10.1002/aro2.70018

Sara Platto

Animal Research and One Health, Volume 3, Issue 3, pp. 344-347. Full text: https://onlinelibrary.wiley.com/doi/10.1002/aro2.70018


Abstract

Despite significant advances in the field of animal welfare, its assessment remains a methodological challenge, because an animal's affective state cannot always be measured directly and must instead be inferred from behavioral, physiological, environmental, and nutritional indicators [1, 2]. This constraint has led to the exploration of artificial intelligence (AI)-driven tools, including machine learning (ML), computer vision, and sensor-based systems, as possible resources for dynamic, real-time welfare assessment and predictive analytics [3, 4]. For example, AI-driven wearable sensors facilitate early detection of stress and disease by continuously monitoring vital signs and behavioral patterns in cattle, pigs, and poultry [5], whereas machine learning can optimize feeding regimes and identify health conditions such as lameness [6]. In wildlife conservation, AI-enhanced technologies, including unmanned aerial vehicles (UAVs), thermal imaging, and acoustic monitoring, enable detailed tracking of animal movements and habitat use, as well as identification of anthropogenic threats such as poaching [7, 8]. AI applications are also emerging within zoological institutions, where neural networks and wearable sensors are used to gather behavioral and physiological data on captive animals, supporting comprehensive welfare assessments [9, 10]. In companion animal care, AI innovations have advanced diagnostics, cancer screening, and real-time health monitoring through Internet of Things (IoT)-enabled collars [11, 12]. AI is also making a significant impact in laboratory environments, where it supports the 3Rs (replacement, reduction, and refinement) by reducing animal use through predictive toxicology frameworks such as the ONTOX project [13]. Additionally, AI-based automated husbandry systems are being considered as a way to minimize human–animal interactions, thus reducing handling-related stress [14].

Although AI presents promising opportunities to identify how animals perceive and experience their own well-being, its integration into animal welfare science remains limited [15]. This is largely attributed to persistent practical, conceptual, and technical challenges that limit the widespread application and translation of AI-based models in real-world animal welfare contexts [16].

A central technical constraint in AI implementation for animal welfare is the requirement for large, labeled datasets to train the algorithms [17]. Most deep learning models demand substantial volumes of high-quality, labeled data to achieve high accuracy, particularly for behavioral assessments [18]. Studies estimate that up to 1000 samples per behavioral class may be necessary for accurate baseline classification, with some models requiring significantly more data depending on network complexity and task specificity [17]. This reliance on large volumes of data makes collecting and labeling the data a resource-intensive task for researchers [19]. The need for large datasets is also supported by studies showing that, as sample size increases, effect sizes and classification accuracy also increase, yielding datasets with high discriminatory power between behavioral classes [17]. This creates a conundrum in which the pursuit of larger sample sizes for better accuracy must be balanced against the practicalities and costs of data collection [20, 21]. To address this challenge, researchers have begun exploring semi-supervised learning approaches, in which large volumes of unlabeled data are combined with smaller, labeled datasets to improve model performance [22]. Although this method shows promise, its effectiveness is often limited by issues such as distribution mismatch, where the characteristics of the labeled and unlabeled data differ significantly [23].
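As a rough illustration of the semi-supervised idea, the sketch below uses self-training (pseudo-labeling), which is only one of several semi-supervised strategies and is not claimed to be the method used in the cited studies. A deliberately simple nearest-centroid classifier is fitted on a few labeled samples, and unlabeled samples are adopted only when the classifier is confident about them. The feature values, the behavior names, and the `min_margin` threshold are all hypothetical.

```python
def centroids(points, labels):
    """Mean feature vector for each class label."""
    by_class = {}
    for p, y in zip(points, labels):
        by_class.setdefault(y, []).append(p)
    return {
        y: tuple(sum(col) / len(col) for col in zip(*ps))
        for y, ps in by_class.items()
    }

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def predict(cents, p):
    """Return (label, margin): the nearest class and how much closer
    it is than the runner-up. Assumes at least two classes."""
    ranked = sorted(cents, key=lambda y: sq_dist(cents[y], p))
    best, second = ranked[0], ranked[1]
    return best, sq_dist(cents[second], p) - sq_dist(cents[best], p)

def self_train(labeled, labels, unlabeled, min_margin=0.5, rounds=5):
    """Adopt confident pseudo-labels, refit, and repeat."""
    points, ys, pool = list(labeled), list(labels), list(unlabeled)
    for _ in range(rounds):
        cents = centroids(points, ys)
        still_unlabeled, added = [], 0
        for p in pool:
            y, margin = predict(cents, p)
            if margin >= min_margin:      # confident: use as training data
                points.append(p)
                ys.append(y)
                added += 1
            else:                         # ambiguous: leave unlabeled
                still_unlabeled.append(p)
        pool = still_unlabeled
        if added == 0:
            break
    return centroids(points, ys), pool

# Hypothetical (activity level, activity variance) features per time window.
labeled = [(0.1, 0.2), (0.9, 1.0)]
labels = ["resting", "walking"]
unlabeled = [(0.2, 0.1), (0.0, 0.3), (1.0, 0.9), (0.8, 1.1), (0.5, 0.6)]

model, leftover = self_train(labeled, labels, unlabeled, min_margin=0.5)
```

The confidence threshold is the weak point that the distribution-mismatch caveat refers to: if the unlabeled samples come from a different setting than the labeled ones, confident pseudo-labels can still be wrong and will then reinforce the error in later rounds.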

In addition, properly labeled datasets are highly valuable, since they can enhance the validation of AI applications in practice. Their value can, in turn, create obstacles to sharing labeled data, especially when a finished product is to be marketed [19]. In fact, the substantial logistical and ethical barriers to integrating data across decentralized systems such as farms, laboratories, or universities remain a major concern, specifically where ownership, standardization, privacy, and security are involved [24]. This situation can also deter the development of proper transdisciplinary collaborations, which represent an essential step toward broader adoption of AI in animal welfare [25]. Furthermore, the co-development of AI tools by experts from different fields is often challenged by divergent understandings of the core concept of “animal welfare” [25], with each discipline bringing its own interpretation of the subject: AI engineers may emphasize measurable outputs, whereas veterinarians prioritize health, natural behaviors, and affective states [26, 27]. To overcome these obstacles, Fogel and Kvedar [28] suggested using a three-tier validation framework, often used in medical research collaborations, that includes (1) verification, (2) analytical validation, and (3) application-specific validation, to improve reliability and foster mutual understanding across disciplines.

Another issue is the lack of contextual generalization in AI tools [29]. For example, most machine vision research papers use data from one group of animals to train deep learning neural networks to recognize their behaviors, but the networks' capacity to work with other groups of animals, in different locations, and under different light conditions is never tested, and the next paper rarely builds on the last one [29, 30]. This issue is particularly pronounced in the field of animal behavior, where no two studies are alike because of differences in species, environments, and behavioral definitions, making it difficult for models to generalize reliably across datasets [24, 31]. Addressing the lack of contextual generalization will require field-based testing and the development of species- and context-specific benchmarks, ideally co-designed with the end users (animals) in mind to ensure usability [32]. The lack of contextual generalization can lead to a lack of systematic validation of AI tools in welfare contexts: a recent review reported that only 5% of precision livestock farming technologies for pigs had been validated under practical farm conditions before market release [32]. This validation gap could undermine user confidence and slow the adoption of AI tools within the animal welfare field.
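One common way to test for this kind of generalization is leave-one-group-out evaluation: the model is trained on all groups of animals except one and scored on the group it has never seen. The sketch below is a minimal, assumption-laden illustration rather than a procedure taken from the cited papers. The farm names, feature values, and the nearest-centroid classifier are hypothetical, and `farm_C` is given deliberately shifted features to mimic a site with different lighting.

```python
def nearest_centroid_fit(rows):
    """rows: list of (features, label). Returns {label: centroid}."""
    grouped = {}
    for feats, label in rows:
        grouped.setdefault(label, []).append(feats)
    return {
        label: tuple(sum(col) / len(col) for col in zip(*pts))
        for label, pts in grouped.items()
    }

def nearest_centroid_predict(cents, feats):
    """Label of the centroid closest to feats (squared Euclidean distance)."""
    return min(
        cents,
        key=lambda lab: sum((a - b) ** 2 for a, b in zip(cents[lab], feats)),
    )

def leave_one_group_out(samples):
    """samples: list of (group, features, label). Returns {group: accuracy},
    where each group is scored by a model trained on all other groups."""
    scores = {}
    for held_out in sorted({g for g, _, _ in samples}):
        train = [(f, y) for g, f, y in samples if g != held_out]
        test = [(f, y) for g, f, y in samples if g == held_out]
        cents = nearest_centroid_fit(train)
        correct = sum(nearest_centroid_predict(cents, f) == y for f, y in test)
        scores[held_out] = correct / len(test)
    return scores

# Hypothetical image-derived features; farm_C values are shifted upward.
samples = [
    ("farm_A", (0.1, 0.1), "resting"), ("farm_A", (0.2, 0.1), "resting"),
    ("farm_A", (0.9, 0.8), "walking"), ("farm_A", (1.0, 0.9), "walking"),
    ("farm_B", (0.2, 0.2), "resting"), ("farm_B", (0.1, 0.3), "resting"),
    ("farm_B", (0.8, 1.0), "walking"), ("farm_B", (0.9, 0.9), "walking"),
    ("farm_C", (0.7, 0.7), "resting"), ("farm_C", (0.8, 0.6), "resting"),
    ("farm_C", (1.5, 1.4), "walking"), ("farm_C", (1.6, 1.5), "walking"),
]

scores = leave_one_group_out(samples)
for group, accuracy in scores.items():
    print(group, accuracy)
```

In this made-up data the shifted farm scores lower than the other two when held out, which is the pattern a random train/test split within a single group would hide.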

Moreover, the amount of data that can be used to train the algorithms is also constrained by infrastructure limitations [29]. Specifically, handling large datasets demands substantial computing power and reliable storage capacity, resources that frequently exceed the capabilities of the equipment in standard research facilities [29]. In addition, many AI tools demand coding knowledge or familiarity with command-line environments, limiting accessibility to those with technical expertise [33]. This excludes a large proportion of animal welfare practitioners, such as veterinarians, farmers, and zookeepers [29]. Although platforms such as DeepLabCut, AniPose, and DeepEthogram have begun addressing this issue by offering more intuitive interfaces for assessing animal movement and classifying behavior, these tools still require refinement for broader usability [34]. Furthermore, concerns about animal safety have been raised regarding wearable technologies, which are frequently used for animal welfare monitoring [34]. Specifically, poorly designed sensors may restrict animals' movement or disrupt social interactions, leading to altered behaviors and possible injuries [35-37]. Similarly, many tools developed for dairy cattle are repurposed for other livestock species without sufficient validation, resulting in welfare concerns and performance deficits [33]. Additionally, wearable sensor tags often face limitations in real-time data transmission, particularly in rural or open-space environments where connectivity may be unreliable. These interruptions can result in significant data gaps, undermining continuous monitoring efforts [37]. Moreover, trade-offs between battery life, sensor size, and device weight pose additional design challenges, frequently compromising either the durability of the device or the frequency and resolution of data collection [38].

Finally, ethical implications must not be overlooked [35]. The automation of animal welfare assessment through AI has raised legitimate concerns about its impact on the human–animal relationship, a cornerstone of animal welfare [39]. Over-reliance on technology may lead to workforce deskilling and reduced focus on the animals' needs, with the risk of increasing their objectification [35, 36]. Therefore, the use of AI in settings involving sentient beings must be approached with transparency, accountability, and a framework of responsible innovation that considers broader societal values [35]. These concerns align closely with the principles of One Welfare, which emphasize the interdependence of animal, human, and environmental well-being in decision-making [40].

In conclusion, the challenges and limitations associated with AI-driven tools in the field of animal welfare should be perceived not only as shortcomings of the technology but also as opportunities for refinement and innovation [16]. These issues, whether technical, ethical, or operational, highlight the need for interdisciplinary collaboration, transparent data sharing, and context- and species-specific validation of the technologies [24, 25, 32]. Overcoming these challenges will be essential to ensure that AI technologies are not only scientifically robust but also ethically aligned with the broader goals of the One Welfare paradigm [41].

Sara Platto: conceptualization, writing – original draft, writing – review and editing.

The author declares no conflicts of interest.

