Digital Sentience? Evaluating the Integration of AI-Driven Tools in Animal Welfare Assessment
Sara Platto
Animal Research and One Health, vol. 3, no. 3, pp. 344-347. Published 2025-05-13.
DOI: 10.1002/aro2.70018
https://onlinelibrary.wiley.com/doi/10.1002/aro2.70018
The author declares no conflicts of interest.
Citations: 0
Abstract
Despite significant advances in the field of animal welfare, its assessment remains a methodological challenge, because an animal's affective state cannot always be measured directly and must instead be inferred from behavioral, physiological, environmental, and nutritional indicators [1, 2]. This constraint has led to the exploration of artificial intelligence (AI)-driven tools, including machine learning (ML), computer vision, and sensor-based systems, as possible resources to facilitate dynamic, real-time welfare assessments and predictive analytics [3, 4]. For example, AI-driven wearable sensors facilitate early detection of stress and disease by continuously monitoring vital signs and behavioral patterns in cattle, pigs, and poultry [5], whereas machine learning can optimize feeding regimes and identify health conditions such as lameness [6]. In wildlife conservation, AI-enhanced technologies, including unmanned aerial vehicles (UAVs), thermal imaging, and acoustic monitoring, enable detailed tracking of animal movements and habitat use, as well as identification of anthropogenic threats such as poaching [7, 8]. AI applications are also emerging within zoological institutions, where neural networks and wearable sensors are employed to gather behavioral and physiological data on captive animals, supporting comprehensive welfare assessments [9, 10]. In companion animal care, AI innovations have advanced diagnostics, cancer screening, and real-time health monitoring through IoT (Internet of Things)-enabled collars [11, 12]. AI is also making a significant impact in laboratory environments, where it supports the 3Rs by reducing animal use through predictive toxicology frameworks such as the ONTOX project [13]. Additionally, automated husbandry systems employing AI are being considered as a way to minimize human–animal interactions, thus reducing stress associated with handling [14].
Although artificial intelligence (AI) presents promising opportunities to identify how animals perceive and experience their own well-being, its integration into animal welfare science remains limited [15]. This limited uptake is largely attributed to persistent practical, conceptual, and technical challenges that restrict the widespread application and translation of AI-based models in real-world animal welfare contexts [16].
A central technical constraint in AI implementation for animal welfare is the requirement for large, labeled datasets to train the algorithms [17]. Most deep learning models demand substantial volumes of high-quality, labeled data to achieve high accuracy, particularly for behavioral assessments [18]. Studies estimate that up to 1000 samples per behavioral class may be necessary for accurate baseline classification, with some models requiring significantly more data depending on network complexity and task specificity [17]. This reliance on large volumes of data makes collecting and labeling the data a resource-intensive task for researchers [19]. The need for large amounts of data is also supported by studies showing that as sample size increases, effect sizes and classification accuracy also increase, yielding datasets with high discriminatory power between behavioral classes [17]. This leads to a conundrum in which the pursuit of a large sample size for better accuracy must be balanced against the practicalities and costs of data collection [20, 21]. To address this challenge, researchers have begun exploring semi-supervised learning approaches, in which large volumes of unlabeled data are combined with smaller labeled datasets to improve model performance [22]. Although this method shows promise, its effectiveness is often limited by issues such as distribution mismatch, where the characteristics of the labeled and unlabeled data differ significantly [23].
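As a minimal sketch of the semi-supervised idea described above, the following Python example uses self-training (pseudo-labeling) with a nearest-centroid classifier. Self-training is only one of several semi-supervised strategies and is not necessarily the approach used in the cited work [22]. The feature values, the class names, and the `min_margin` threshold are hypothetical and chosen purely for illustration.

```python
from statistics import fmean


def centroid(points):
    """Mean vector of a list of equal-length feature vectors."""
    return [fmean(column) for column in zip(*points)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit_centroids(labeled):
    """labeled: list of (features, label). Returns {label: centroid}."""
    by_label = {}
    for feats, label in labeled:
        by_label.setdefault(label, []).append(feats)
    return {label: centroid(pts) for label, pts in by_label.items()}


def predict(centroids, feats):
    """Return (label, margin): margin is the distance gap between the
    nearest and the second-nearest class centroid."""
    ranked = sorted((sq_dist(feats, c), label) for label, c in centroids.items())
    best_dist, best_label = ranked[0]
    margin = ranked[1][0] - best_dist if len(ranked) > 1 else float("inf")
    return best_label, margin


def self_train(labeled, unlabeled, min_margin=1.0, rounds=5):
    """Repeatedly pseudo-label confident unlabeled samples and refit.
    Returns the final centroids and the number of samples left unlabeled."""
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(rounds):
        centroids = fit_centroids(labeled)
        confident, remaining = [], []
        for feats in pool:
            label, margin = predict(centroids, feats)
            if margin >= min_margin:
                confident.append((feats, label))
            else:
                remaining.append(feats)
        if not confident:
            break  # nothing new to learn from
        labeled.extend(confident)
        pool = remaining
    return fit_centroids(labeled), len(pool)


# Toy, made-up data: two hypothetical movement features per sample.
labeled = [
    ((0.1, 0.2), "resting"), ((0.2, 0.1), "resting"),
    ((2.0, 2.1), "walking"), ((2.2, 1.9), "walking"),
]
unlabeled = [(0.0, 0.3), (2.1, 2.0), (1.1, 1.0), (0.3, 0.2)]

centroids, leftover = self_train(labeled, unlabeled)
```

In this toy data, one ambiguous sample stays unlabeled because it lies between the two classes. If the unlabeled pool came from a different distribution than the labeled set (the mismatch noted above), confident but wrong pseudo-labels could be added instead, so a confidence threshold alone may not be sufficient.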
In addition, properly labeled datasets are highly valuable, since they can enhance the validation of AI applications in practice. Their value can complicate the sharing of labeled data, especially if a finished product is to be marketed [19]. In fact, the substantial logistical and ethical barriers to integrating data across decentralized systems such as farms, laboratories, or universities remain a major concern, specifically when ownership, standardization, privacy, and security are involved [24]. This situation can also deter the development of proper transdisciplinary collaborations, which represent an essential step toward broader adoption of AI in animal welfare [25]. Furthermore, the co-development of AI tools among experts from different fields is often challenged by divergent understandings of the core concept of “animal welfare” [25], where each discipline may bring its own interpretation of the subject: AI engineers may emphasize measurable outputs, whereas veterinarians prioritize health, natural behaviors, and affective states [26, 27]. To overcome these obstacles, Fogel and Kvedar [28] suggested using a three-tier validation framework, often used in medical research collaborations, that includes (1) verification, (2) analytical validation, and (3) application-specific validation, to improve reliability and foster mutual understanding across disciplines.
Another issue is the lack of contextual generalization in AI tools [29]. For example, most machine vision research papers use data from one group of animals to train deep learning neural networks to recognize their behaviors, but the networks' capacity to work with other groups of animals, in different locations, or under different light conditions is never tested, and the next paper rarely builds on the last one [29, 30]. This issue is particularly pronounced in the field of animal behavior, where no two studies are alike owing to differences in species, environments, and behavioral definitions, making it difficult for models to generalize reliably across datasets [24, 31]. Addressing the lack of contextual generalization will require field-based testing and the development of species- and context-specific benchmarks, ideally co-designed with the end users (animals) in mind to ensure usability [32]. The lack of contextual generalization can also contribute to a lack of systematic validation of AI tools in welfare contexts: a recent review reported that only 5% of precision livestock farming technologies for pigs had been validated under practical farm conditions before market release [32]. This validation gap could undermine user confidence and slow the adoption of AI tools within the animal welfare field.
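One common way to probe the kind of generalization discussed above is to hold out an entire group of animals (for example, one farm) during training and test only on that group. The Python sketch below illustrates this leave-one-group-out idea with a deliberately simple one-feature threshold classifier. The farm names, the "movement score" feature, and all values are invented for illustration and do not come from the cited studies.

```python
def leave_one_group_out(samples):
    """Yield (held_out_group, train, test), holding out one group at a time."""
    groups = sorted({s["group"] for s in samples})
    for group in groups:
        train = [s for s in samples if s["group"] != group]
        test = [s for s in samples if s["group"] == group]
        yield group, train, test


def fit_threshold(train):
    """Hypothetical classifier: threshold at the midpoint of the class means."""
    means = {}
    for label in ("rest", "active"):
        values = [s["x"] for s in train if s["y"] == label]
        means[label] = sum(values) / len(values)
    return (means["rest"] + means["active"]) / 2


def accuracy(threshold, test):
    """Fraction of test samples whose thresholded score matches the label."""
    correct = sum(
        ("active" if s["x"] > threshold else "rest") == s["y"] for s in test
    )
    return correct / len(test)


def make(group, pairs):
    """Build sample records for one group from (score, label) pairs."""
    return [{"group": group, "x": x, "y": y} for x, y in pairs]


# Made-up movement scores. farm_c is assumed to use a different camera setup,
# so all of its scores are shifted upward relative to the other farms.
samples = (
    make("farm_a", [(0.2, "rest"), (0.3, "rest"), (0.8, "active"), (0.9, "active")])
    + make("farm_b", [(0.1, "rest"), (0.25, "rest"), (0.85, "active"), (0.95, "active")])
    + make("farm_c", [(0.7, "rest"), (0.8, "rest"), (1.4, "active"), (1.5, "active")])
)

scores = {
    group: accuracy(fit_threshold(train), test)
    for group, train, test in leave_one_group_out(samples)
}
```

With this toy data, the model trained on farm_a and farm_b misclassifies every resting animal on farm_c, which a random split pooling all three farms would tend to hide. Reporting accuracy per held-out group, rather than one pooled figure, makes that kind of context dependence visible.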
Moreover, the amount of data that can be used to train the algorithms is also constrained by infrastructure limitations [29]. Specifically, handling large datasets demands substantial computing power and reliable storage capacity, resources that frequently exceed the capabilities of standard research facilities' equipment [29]. In addition, many AI tools demand coding knowledge or familiarity with command-line environments, limiting accessibility to those with technical expertise [33]. This excludes a large proportion of animal welfare practitioners, such as veterinarians, farmers, and zookeepers [29]. Although platforms such as DeepLabCut, AniPose, and DeepEthogram have begun addressing this issue by offering more intuitive interfaces for assessing animal movements and classifying behaviors, these tools still require refinement for broader usability [34]. Furthermore, concerns about animal safety have been raised in relation to wearable technologies, which are frequently used for animal welfare monitoring [34]. Specifically, poorly designed sensors may restrict animals' movement or disrupt social interactions, leading to altered behaviors and possible injuries [35-37]. Similarly, many tools developed for dairy cattle are repurposed for other livestock species without sufficient validation, resulting in welfare concerns and performance deficits [33]. Additionally, wearable sensor tags often face limitations in real-time data transmission, particularly in rural or open-space environments where connectivity may be unreliable. These interruptions can result in significant data gaps, undermining continuous monitoring efforts [37]. Moreover, trade-offs between battery life, sensor size, and device weight pose additional design challenges, frequently compromising either the durability of the device or the frequency and resolution of data collection [38].
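The data gaps mentioned above can be quantified before any welfare analysis is attempted. The following Python sketch, offered only as an illustration, flags gaps in a list of sensor timestamps and estimates how much of the expected record was actually received. The 60-second sampling interval, the `tolerance` factor, and the timestamps are assumptions, not values from the cited work.

```python
def find_gaps(timestamps, expected_interval, tolerance=1.5):
    """Return (start, end) pairs where consecutive readings are further apart
    than expected_interval * tolerance (all values in seconds)."""
    ordered = sorted(set(timestamps))
    return [
        (prev, cur)
        for prev, cur in zip(ordered, ordered[1:])
        if cur - prev > expected_interval * tolerance
    ]


def coverage(timestamps, expected_interval):
    """Fraction of expected readings received between first and last reading."""
    ordered = sorted(set(timestamps))
    if len(ordered) < 2:
        return 0.0
    expected = (ordered[-1] - ordered[0]) // expected_interval + 1
    return len(ordered) / expected


# Made-up record: a tag that should report every 60 s but drops out
# between t = 180 s and t = 480 s.
readings = [0, 60, 120, 180, 480, 540, 600]

gaps = find_gaps(readings, expected_interval=60)
share_received = coverage(readings, expected_interval=60)
```

Here a single five-minute gap is detected and 7 of the 11 expected readings are present. A check of this kind could help decide whether a monitoring period is complete enough to interpret, although acceptable thresholds would depend on the species and the behavior being monitored.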
Finally, ethical implications must not be overlooked [35]. The automation of animal welfare assessment through AI has raised legitimate concerns about its impact on the human–animal relationship, a cornerstone of animal welfare [39]. Over-reliance on technology may lead to workforce deskilling and reduced focus on the animals' needs, with the risk of increasing their objectification [35, 36]. Therefore, the use of AI in settings involving sentient beings must be approached with transparency, accountability, and a framework of responsible innovation that considers broader societal values [35]. These concerns align closely with the principles of One Welfare, which emphasize the interdependence of animal, human, and environmental well-being in decision-making [40].
In conclusion, the challenges and limitations associated with AI-driven tools in the field of animal welfare should be perceived not only as shortcomings of the technology but also as opportunities for refinement and innovation [16]. These issues, whether technical, ethical, or operational, highlight the need for interdisciplinary collaboration, transparent data sharing, and context- and species-specific validation of the technologies [24, 25, 32]. Overcoming these challenges will be essential to ensure that AI technologies are not only scientifically robust but also ethically aligned with the broader goals of the One Welfare paradigm [41].
Sara Platto: conceptualization, writing – original draft, writing – review and editing.