Video analysis for the detection of animals using convolutional neural networks and consumer-grade drones

Journal of Unmanned Vehicle Systems · IF 1.3 · Q3 (Remote Sensing)
C. Chalmers, P. Fergus, C. C. Montañez, S. Longmore, S. Wich
{"title":"Video analysis for the detection of animals using convolutional neural networks and consumer-grade drones","authors":"C. Chalmers, P. Fergus, C. C. Montañez, S. Longmore, S. Wich","doi":"10.1139/JUVS-2020-0018","DOIUrl":null,"url":null,"abstract":"Determining animal distribution and density is important in conservation. The process is both time-consuming and labour-intensive. Drones have been used to help mitigate human-intensive tasks by covering large geographical areas over a much shorter timescale. In this paper we investigate this idea further using a proof of concept to detect rhinos and cars from drone footage. The proof of concept utilises off-the-shelf technology and consumer-grade drone hardware. The study demonstrates the feasibility of using machine learning (ML) to automate routine conservation tasks, such as animal detection and tracking. The prototype has been developed using a DJI Mavic Pro 2 and tested over a global system for mobile communications (GSM) network. The Faster-RCNN Resnet 101 architecture is used for transfer learning. Inference is performed with a frame sampling technique to address the required trade-off between precision, processing speed, and live video feed synchronisation. Inference models are hosted on a web platform and video streams from the drone (using OcuSync) are transmitted to a real-time messaging protocol (RTMP) server for subsequent classification. During training, the best model achieves a mean average precision (mAP) of 0.83 intersection over union (@IOU) 0.50 and 0.69 @IOU 0.75, respectively. On testing the system in Knowsley Safari our prototype was able to achieve the following: sensitivity (Sen), 0.91 (0.869, 0.94); specificity (Spec), 0.78 (0.74, 0.82); and an accuracy (ACC), 0.84 (0.81, 0.87) when detecting rhinos, and Sen, 1.00 (1.00, 1.00); Spec, 1.00 (1.00, 1.00); and an ACC, 1.00 (1.00, 1.00) when detecting cars.","PeriodicalId":45619,"journal":{"name":"Journal of Unmanned Vehicle Systems","volume":" ","pages":""},"PeriodicalIF":1.3000,"publicationDate":"2021-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Unmanned Vehicle Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1139/JUVS-2020-0018","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"REMOTE SENSING","Score":null,"Total":0}
引用次数: 6

Abstract

Determining animal distribution and density is important in conservation, but the process is both time-consuming and labour-intensive. Drones can mitigate such human-intensive tasks by covering large geographical areas over a much shorter timescale. In this paper we investigate this idea further with a proof of concept for detecting rhinos and cars in drone footage. The proof of concept uses off-the-shelf technology and consumer-grade drone hardware, and the study demonstrates the feasibility of using machine learning (ML) to automate routine conservation tasks such as animal detection and tracking. The prototype was developed with a DJI Mavic 2 Pro and tested over a global system for mobile communications (GSM) network. The Faster R-CNN ResNet-101 architecture is used for transfer learning. Inference employs a frame sampling technique to address the trade-off between precision, processing speed, and live video feed synchronisation. Inference models are hosted on a web platform, and video streams from the drone (using OcuSync) are transmitted to a real-time messaging protocol (RTMP) server for subsequent classification. During training, the best model achieves a mean average precision (mAP) of 0.83 at an intersection over union (IoU) threshold of 0.50 and 0.69 at IoU 0.75. When tested at Knowsley Safari, the prototype achieved a sensitivity (Sen) of 0.91 (0.869, 0.94), a specificity (Spec) of 0.78 (0.74, 0.82), and an accuracy (ACC) of 0.84 (0.81, 0.87) when detecting rhinos, and Sen 1.00 (1.00, 1.00), Spec 1.00 (1.00, 1.00), and ACC 1.00 (1.00, 1.00) when detecting cars.
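The transfer-learning step starts from a detector pre-trained on a large generic dataset and swaps its classification head for the target classes (rhino and car). The abstract names Faster R-CNN with a ResNet-101 backbone but not a framework; the sketch below is one plausible reading using torchvision, whose ResNet-50 FPN variant stands in for the backbone, since the fine-tuning pattern is the same.

```python
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

# Start from a detector pre-trained on COCO. The paper fine-tunes a
# ResNet-101 backbone; torchvision's ResNet-50 FPN model stands in here.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")

# Replace the classification head: background + rhino + car.
num_classes = 3
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
model.train()  # ready for fine-tuning on the drone imagery
```

Only the box-predictor head is rebuilt; the pre-trained backbone weights are reused, which is what makes training feasible on a small conservation dataset.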
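The frame sampling trade-off described above can be illustrated with a short consumer loop: read the live RTMP stream and pass only every n-th frame to the detector, so classification keeps pace with the live feed. This is a minimal sketch assuming OpenCV's FFmpeg-backed VideoCapture; the stream URL and sampling interval are illustrative values, not taken from the paper.

```python
import cv2

RTMP_URL = "rtmp://example.org/live/drone"  # placeholder server address
SAMPLE_EVERY_N = 5                          # illustrative sampling interval

def sample_frames(url: str, every_n: int):
    """Yield every n-th frame from a live RTMP stream."""
    cap = cv2.VideoCapture(url)
    idx = 0
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break  # stream dropped or ended
            if idx % every_n == 0:
                yield frame  # hand this frame to the detector
            idx += 1
    finally:
        cap.release()

for frame in sample_frames(RTMP_URL, SAMPLE_EVERY_N):
    pass  # run Faster R-CNN inference on `frame` here
```

Raising the interval keeps the feed synchronised at the cost of skipped frames; lowering it improves coverage but lets inference fall behind the live stream.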
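The training mAP figures are quoted at IoU thresholds of 0.50 and 0.75: a detection counts as a true positive only when its box overlaps a ground-truth box at or above the threshold. Below is a minimal IoU computation, assuming (x1, y1, x2, y2) corner-format boxes (the paper's box convention is not given in the abstract).

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

# mAP@0.50 accepts matches with iou(...) >= 0.50; the 0.69 figure uses the
# stricter 0.75 threshold, which demands tighter localisation.
```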
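The Knowsley Safari results are point estimates followed by interval bounds. Given per-frame true/false positive and negative counts, the three reported metrics can be recomputed as below; the abstract does not say how the intervals were derived, so the Wilson score interval is shown here purely as one common choice.

```python
from math import sqrt

def wilson_interval(k: int, n: int, z: float = 1.96) -> tuple:
    """95% Wilson score interval for a proportion k/n (assumed method)."""
    if n == 0:
        return (0.0, 0.0)
    p = k / n
    denom = 1 + z * z / n
    centre = p + z * z / (2 * n)
    margin = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return ((centre - margin) / denom, (centre + margin) / denom)

def detection_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Sensitivity, specificity, and accuracy, each with a 95% interval."""
    total = tp + fp + tn + fn
    return {
        "sensitivity": (tp / (tp + fn), *wilson_interval(tp, tp + fn)),
        "specificity": (tn / (tn + fp), *wilson_interval(tn, tn + fp)),
        "accuracy": ((tp + tn) / total, *wilson_interval(tp + tn, total)),
    }
```

For the rhino class, sensitivity = TP / (TP + FN) is the fraction of rhino frames the system flags, while specificity = TN / (TN + FP) is the fraction of rhino-free frames it correctly ignores.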