Video Object Detection Using Motion Context and Feature Aggregation

Jaekyum Kim, Junho Koh, J. Choi
Published in: 2020 International Conference on Information and Communication Technology Convergence (ICTC), 2020-10-21. DOI: 10.1109/ICTC49870.2020.9289386 (https://doi.org/10.1109/ICTC49870.2020.9289386)

Abstract

Deep learning has recently led to significant improvements in object detection accuracy. Many object detection schemes are designed to process each frame independently. In many applications, however, detection is performed on video data, which consists of a sequence of image frames, so accuracy can be improved by exploiting the temporal context of the video sequence. In this paper, we propose a novel video object detection method that exploits both the motion context of the object and spatio-temporally aggregated features to enhance detection performance. First, the motion context of the object is extracted by applying a correlation operator to the feature maps of two adjacent frames. In addition to generating the motion context, the spatial feature maps of N adjacent frames are aggregated by a gated attention network to improve the quality of the feature map.
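The abstract names two building blocks without giving their details: a correlation operator between the feature maps of two adjacent frames, and a gated aggregation of the feature maps of N adjacent frames. The NumPy sketch below is one plausible reading of each, not the authors' implementation. It assumes a local correlation over a small displacement window (in the style of optical-flow correlation layers) and a per-pixel sigmoid gate normalized across frames. The function names, the `max_disp` window size, and the `gate_w` / `gate_b` gate parameters are all hypothetical.

```python
import numpy as np


def local_correlation(feat_a, feat_b, max_disp=1):
    """Correlate two (C, H, W) feature maps over a displacement window.

    Assumed formulation (not from the paper): for each displacement
    (dy, dx) in [-max_disp, max_disp]^2, take the channel-averaged dot
    product between feat_a at (y, x) and feat_b at (y + dy, x + dx).
    Returns an array of shape ((2 * max_disp + 1) ** 2, H, W).
    """
    c, h, w = feat_a.shape
    d = max_disp
    # Zero-pad feat_b spatially so shifted windows stay in bounds.
    padded = np.pad(feat_b, ((0, 0), (d, d), (d, d)))
    out = []
    for dy in range(-d, d + 1):
        for dx in range(-d, d + 1):
            shifted = padded[:, d + dy:d + dy + h, d + dx:d + dx + w]
            out.append((feat_a * shifted).sum(axis=0) / c)
    return np.stack(out)


def gated_aggregate(feats, gate_w, gate_b=0.0):
    """Fuse N feature maps of shape (N, C, H, W) into one (C, H, W) map.

    Assumed gating (not from the paper): a hypothetical 1x1 projection
    gate_w of shape (C,) gives one logit per frame and pixel, a sigmoid
    turns it into a gate, and gates are normalized over the N frames.
    """
    logits = np.einsum("nchw,c->nhw", feats, gate_w) + gate_b
    gates = 1.0 / (1.0 + np.exp(-logits))               # (N, H, W)
    weights = gates / gates.sum(axis=0, keepdims=True)  # sums to 1 over N
    return (weights[:, None] * feats).sum(axis=0)       # (C, H, W)
```

With `max_disp=1`, two `(C, H, W)` maps yield a 9-channel correlation volume that a detector could concatenate with the aggregated features as a motion cue. In a trained network the gate parameters would be learned rather than supplied by hand; they are explicit arguments here only to keep the sketch self-contained.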