Video analysis for the detection of animals using convolutional neural networks and consumer-grade drones

IF: 1.3 | JCR: Q3 | Category: Remote Sensing

Journal of Unmanned Vehicle Systems | Pub Date: 2021-04-15 | DOI: 10.1139/JUVS-2020-0018

C. Chalmers, P. Fergus, C. C. Montañez, S. Longmore, S. Wich


Citations: 6

Abstract

Determining animal distribution and density is important in conservation, but the process is both time-consuming and labour-intensive. Drones have been used to reduce this human effort by covering large geographical areas over a much shorter timescale. In this paper we investigate this idea further using a proof of concept that detects rhinos and cars in drone footage. The proof of concept uses off-the-shelf technology and consumer-grade drone hardware. The study demonstrates the feasibility of using machine learning (ML) to automate routine conservation tasks, such as animal detection and tracking. The prototype was developed using a DJI Mavic Pro 2 and tested over a global system for mobile communications (GSM) network. The Faster-RCNN Resnet 101 architecture is used for transfer learning. Inference is performed with a frame sampling technique to address the required trade-off between precision, processing speed, and live video feed synchronisation. Inference models are hosted on a web platform, and video streams from the drone (using OcuSync) are transmitted to a real-time messaging protocol (RTMP) server for subsequent classification. During training, the best model achieves a mean average precision (mAP) of 0.83 at an intersection over union (IOU) threshold of 0.50 and 0.69 at an IOU threshold of 0.75. When tested at Knowsley Safari, the prototype achieved a sensitivity (Sen) of 0.91 (0.869, 0.94), a specificity (Spec) of 0.78 (0.74, 0.82), and an accuracy (ACC) of 0.84 (0.81, 0.87) when detecting rhinos, and a Sen of 1.00 (1.00, 1.00), a Spec of 1.00 (1.00, 1.00), and an ACC of 1.00 (1.00, 1.00) when detecting cars.
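The quantities named in the abstract (IOU, frame sampling, and sensitivity, specificity, and accuracy) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the box format `(x_min, y_min, x_max, y_max)`, the fixed "every n-th frame" sampling rule, and all numeric inputs are assumptions chosen for illustration, since the abstract does not state the sampling rate or the underlying confusion-matrix counts.

```python
from typing import List, Tuple

# Hypothetical box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def sampled_frames(total_frames: int, every_n: int) -> List[int]:
    """Indices of frames sent for inference when only every n-th frame is used.

    Assumed sampling rule; the abstract does not specify one.
    """
    if every_n < 1:
        raise ValueError("every_n must be at least 1")
    return list(range(0, total_frames, every_n))


def sen_spec_acc(tp: int, fp: int, tn: int, fn: int) -> Tuple[float, float, float]:
    """Sensitivity, specificity, and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)            # true positive rate
    specificity = tn / (tn + fp)            # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy
```

For example, `iou((0, 0, 2, 2), (1, 1, 3, 3))` gives 1/7, so that detection would not count as a match at either the 0.50 or the 0.75 threshold. A larger `every_n` lowers the processing load per second of video at the cost of skipping frames, which is the trade-off the abstract describes.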




Source journal

Journal of Unmanned Vehicle Systems (Remote Sensing)

CiteScore

5.30

Self-citation rate

0.00%

Articles published