José Ramón Gómez-Armenta, Humberto Pérez-Espinosa, José Alberto Fernández-Zepeda, Verónica Reyes-Meza
Behavioural Processes (journal article), published 2024-04-20. DOI: 10.1016/j.beproc.2024.105028. URL: https://www.sciencedirect.com/science/article/pii/S0376635724000433
Automatic classification of dog barking using deep learning
Barking and other dog vocalizations have acoustic properties related to emotions, physiological reactions, attitudes, or other internal states. In the field of intelligent audio analysis, researchers use signal processing and machine learning methods to analyze the properties of digitized acoustic signals and extract relevant information. The present work describes a method to classify the identity, breed, age, sex, and context associated with each bark. This information can support the decisions of people who regularly interact with animals, such as dog trainers, veterinarians, rescuers, police, and people with visual impairment. Our approach uses deep neural networks to train a separate model for each classification task. We worked with 19,643 barks recorded from 113 dogs of different breeds, ages, and sexes. Our methodology consists of three stages. First, the pre-processing stage prepares the data and transforms it into the appropriate format for each classification model. Second, the characterization stage evaluates different representation models to identify the most suitable one for each task. Third, the classification stage trains each classification model and selects the best hyperparameters. After tuning and training each model, we evaluated its performance. We analyzed which features extracted from the audio were most relevant and which deep neural network architecture was most appropriate for each feature type. Although our method is not yet ready for use in ethological practice, our evaluation showed that it performs strongly and surpasses previous research results on this topic, providing a basis for further technological development.
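The three stages named in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names, the synthetic tones standing in for barks, the two candidate representations (zero-crossing rate and RMS energy), and the nearest-centroid classifier, which is a deliberately simple stand-in for the deep neural networks the authors use. It only shows the overall shape of the workflow: pre-process each recording, compute several candidate representations, and keep the one that scores best on held-out data.

```python
import math
import random


def preprocess(signal, frame_len=256, hop=128):
    """Stage 1 (pre-processing): peak-normalize, then cut into overlapping frames."""
    peak = max(abs(x) for x in signal) or 1.0
    scaled = [x / peak for x in signal]
    return [scaled[i:i + frame_len]
            for i in range(0, len(scaled) - frame_len + 1, hop)]


def zero_crossing_rate(frames):
    """Candidate representation A (assumed): mean fraction of sign changes per frame."""
    rates = []
    for f in frames:
        changes = sum((a < 0) != (b < 0) for a, b in zip(f, f[1:]))
        rates.append(changes / (len(f) - 1))
    return [sum(rates) / len(rates)]


def rms_energy(frames):
    """Candidate representation B (assumed): mean root-mean-square level per frame."""
    levels = [math.sqrt(sum(x * x for x in f) / len(f)) for f in frames]
    return [sum(levels) / len(levels)]


def fit_centroids(X, y):
    """Stand-in classifier (NOT a deep network): one mean feature vector per class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Return the label whose centroid is closest to feature vector x."""
    return min(centroids, key=lambda lab: math.dist(centroids[lab], x))


def select_representation(signals, labels, candidates, n_train):
    """Stages 2 and 3: score each representation on held-out items, keep the best."""
    scores = {}
    for name, extract in candidates.items():
        X = [extract(preprocess(s)) for s in signals]
        model = fit_centroids(X[:n_train], labels[:n_train])
        hits = sum(predict(model, x) == lab
                   for x, lab in zip(X[n_train:], labels[n_train:]))
        scores[name] = hits / (len(X) - n_train)
    return max(scores, key=scores.get), scores


def synthetic_bark(freq_hz, rng, sr=8000, n=2048):
    """Hypothetical test signal: a noisy sine tone, not real bark audio."""
    phase = rng.uniform(0, 2 * math.pi)
    return [math.sin(2 * math.pi * freq_hz * t / sr + phase) + rng.gauss(0, 0.05)
            for t in range(n)]


rng = random.Random(0)
labels = [i % 2 for i in range(40)]  # two made-up classes, alternating
signals = [synthetic_bark(200 if lab == 0 else 600, rng) for lab in labels]
candidates = {"zero_crossing_rate": zero_crossing_rate, "rms_energy": rms_energy}
best, scores = select_representation(signals, labels, candidates, n_train=30)
print(best, scores)
```

In this toy setup the two classes differ only in pitch, so a pitch-sensitive representation separates them while a loudness-based one carries little information. This mirrors, in a very reduced form, why a characterization stage that compares representations per task can matter.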
About the journal:
Behavioural Processes is dedicated to the publication of high-quality original research on animal behaviour from any theoretical perspective. It welcomes contributions that consider animal behaviour from behavioural analytic, cognitive, ethological, ecological and evolutionary points of view. This list is not intended to be exhaustive, and papers that integrate theory and methodology across disciplines are particularly welcome.