Video analysis for the detection of animals using convolutional neural networks and consumer-grade drones

C. Chalmers, P. Fergus, C. C. Montañez, S. Longmore, S. Wich

Journal of Unmanned Vehicle Systems, 15 April 2021. DOI: 10.1139/JUVS-2020-0018
Determining animal distribution and density is important in conservation, but the process is both time-consuming and labour-intensive. Drones can mitigate such human-intensive tasks by covering large geographical areas over a much shorter timescale. In this paper we investigate this idea further with a proof of concept for detecting rhinos and cars in drone footage. The proof of concept utilises off-the-shelf technology and consumer-grade drone hardware. The study demonstrates the feasibility of using machine learning (ML) to automate routine conservation tasks, such as animal detection and tracking. The prototype was developed using a DJI Mavic Pro 2 and tested over a Global System for Mobile Communications (GSM) network. The Faster R-CNN ResNet 101 architecture is used for transfer learning. Inference is performed with a frame sampling technique to address the required trade-off between precision, processing speed, and live video feed synchronisation. Inference models are hosted on a web platform, and video streams from the drone (using OcuSync) are transmitted to a Real-Time Messaging Protocol (RTMP) server for subsequent classification. During training, the best model achieves a mean average precision (mAP) of 0.83 at an intersection over union (IoU) threshold of 0.50 and 0.69 at IoU 0.75. When tested at Knowsley Safari, the prototype achieved a sensitivity (Sen) of 0.91 (0.869, 0.94), a specificity (Spec) of 0.78 (0.74, 0.82), and an accuracy (ACC) of 0.84 (0.81, 0.87) when detecting rhinos, and Sen 1.00 (1.00, 1.00), Spec 1.00 (1.00, 1.00), and ACC 1.00 (1.00, 1.00) when detecting cars.
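The streaming and frame-sampling pipeline described in the abstract can be sketched in a few lines. The snippet below is a minimal illustration under stated assumptions, not the authors' implementation: the RTMP URL, the sampling interval, and the classify helper are hypothetical stand-ins for the drone's video feed and the web-hosted Faster R-CNN model.

```python
import cv2

# Hypothetical RTMP endpoint standing in for the server that receives
# the drone's OcuSync-relayed video stream.
RTMP_URL = "rtmp://example.org/live/drone"
SAMPLE_EVERY = 5  # run inference on every 5th frame (assumed interval)

def classify(frame):
    """Placeholder for the hosted detection model (e.g. a Faster R-CNN
    served behind a web API); returns a list of (label, score, box)."""
    raise NotImplementedError

def run():
    cap = cv2.VideoCapture(RTMP_URL)
    frame_idx = 0
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        # Frame sampling: classifying only a subset of frames trades some
        # precision for processing speed, keeping inference roughly in
        # sync with the live feed instead of falling behind it.
        if frame_idx % SAMPLE_EVERY == 0:
            detections = classify(frame)
        frame_idx += 1
    cap.release()

if __name__ == "__main__":
    run()
```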
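For reference, the evaluation metrics quoted above follow their standard definitions. The sketch below computes sensitivity, specificity, and accuracy from confusion-matrix counts, along with the IoU measure used to threshold detections when scoring mAP; variable names are illustrative.

```python
def iou(box_a, box_b):
    # Boxes as (x1, y1, x2, y2). IoU = overlap area / union area; a
    # detection counts as correct at IoU 0.50 when iou(...) >= 0.5.
    xa, ya = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    xb, yb = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, xb - xa) * max(0, yb - ya)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def sensitivity(tp, fn):
    # Sen = TP / (TP + FN): fraction of real targets that were detected.
    return tp / (tp + fn)

def specificity(tn, fp):
    # Spec = TN / (TN + FP): fraction of non-targets correctly rejected.
    return tn / (tn + fp)

def accuracy(tp, tn, fp, fn):
    # ACC = (TP + TN) / total: overall fraction of correct decisions.
    return (tp + tn) / (tp + tn + fp + fn)
```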