Latest publications: 2022 18th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)

Fusion Tree Network for RGBT Tracking
Zhiyuan Cheng, Andong Lu, Zhang Zhang, Chenglong Li, Liang Wang
{"title":"Fusion Tree Network for RGBT Tracking","authors":"Zhiyuan Cheng, Andong Lu, Zhang Zhang, Chenglong Li, Liang Wang","doi":"10.1109/AVSS56176.2022.9959406","DOIUrl":"https://doi.org/10.1109/AVSS56176.2022.9959406","url":null,"abstract":"RGBT tracking is often affected by complex scenes (i.e., occlusions, scale changes, noisy background, etc). Existing works usually adopt a single-strategy RGBT tracking fusion scheme to handle modality fusion in all scenarios. However, due to the limitation of fusion model capacity, it is difficult to fully integrate the discriminative features between different modalities. To tackle this problem, we propose a Fusion Tree Network (FTNet), which provides a multi-strategy fusion model with high capacity to efficiently fuse different modalities. Specifically, we combine three kinds of attention modules (i.e., channel attention, spatial attention, and location attention) in a tree structure to achieve multi-path hybrid attention in the deeper convolutional stages of the object tracking network. Extensive experiments are performed on three RGBT tracking datasets, and the results show that our method achieves superior performance among state-of-the-art RGBT tracking models.","PeriodicalId":408581,"journal":{"name":"2022 18th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)","volume":"58 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116937914","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
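The abstract describes combining attention modules in a tree over the deeper convolutional stages to fuse RGB and thermal features. Below is a minimal PyTorch sketch of that idea under my own assumptions (CBAM-style channel and spatial attention leaves, a 1x1 convolution to merge the two modalities, and simple averaging at the root); it is not the authors' FTNet implementation.

```python
# Hedged sketch of a tree-structured attention fusion block (not the authors' code).
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return x * self.fc(x)          # reweight channels

class SpatialAttention(nn.Module):
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)
        mx, _ = x.max(dim=1, keepdim=True)
        attn = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * attn                # reweight spatial locations

class FusionTreeBlock(nn.Module):
    """Two-level 'tree': each leaf applies a different attention path to the
    fused RGB+T features; the root averages the paths."""
    def __init__(self, channels):
        super().__init__()
        self.reduce = nn.Conv2d(2 * channels, channels, 1)   # merge RGB + thermal
        self.paths = nn.ModuleList([
            nn.Sequential(ChannelAttention(channels), SpatialAttention()),
            nn.Sequential(SpatialAttention(), ChannelAttention(channels)),
            ChannelAttention(channels),
        ])

    def forward(self, feat_rgb, feat_t):
        x = self.reduce(torch.cat([feat_rgb, feat_t], dim=1))
        return torch.stack([p(x) for p in self.paths], dim=0).mean(dim=0)
```
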
UPM-GTI-Face: A dataset for the evaluation of the impact of distance and masks in face detection and recognition systems
Marcos Rodrigo, E. González-Sosa, Carlos Cuevas, N. García
{"title":"UPM-GTI-Face: A dataset for the evaluation of the impact of distance and masks in face detection and recognition systems","authors":"Marcos Rodrigo, E. González-Sosa, Carlos Cuevas, N. García","doi":"10.1109/AVSS56176.2022.9959558","DOIUrl":"https://doi.org/10.1109/AVSS56176.2022.9959558","url":null,"abstract":"We present a novel dataset for the evaluation of face detection and recognition algorithms in challenging surveillance scenarios. The dataset consists in 4K images of different subjects captured at annotated distances ranging from 1 to 30 meters, both in indoor and outdoor environments, and under two face mask conditions (with and without). To the best of our knowledge, this is the only existing dataset that addresses the joint impact of masks and distances in a rigorous manner. We also propose an end-to-end fully automatic face detection and recognition system to provide baseline results on this dataset. Face detection is performed using Tiny Faces network, while face recognition is performed using VGG Face network. Experimental results show very high detection and recognition rates up to a distance of 20 meters, where the impact of distance is clear (especially for the latter). The use of face masks degrades the detection range and produces less consistent recognition results.","PeriodicalId":408581,"journal":{"name":"2022 18th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127165370","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
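The baseline system pairs a detector (Tiny Faces) with a recognition network (VGG Face). A minimal sketch of that detect-crop-embed-match flow is below; `detect_faces` and `embed_face` are hypothetical wrappers standing in for the two networks, and the cosine-similarity threshold is an assumed value, not one reported in the paper.

```python
# Hedged sketch of a detect-then-recognize evaluation loop with pluggable models.
import numpy as np

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def identify(image, gallery, detect_faces, embed_face, threshold=0.5):
    """gallery: dict mapping subject id -> reference embedding (1-D np.ndarray)."""
    results = []
    for (x, y, w, h) in detect_faces(image):          # detection stage
        emb = embed_face(image[y:y + h, x:x + w])     # recognition embedding
        best_id, best_sim = None, -1.0
        for subject_id, ref in gallery.items():
            sim = cosine_similarity(emb, ref)
            if sim > best_sim:
                best_id, best_sim = subject_id, sim
        # below threshold, treat the face as unknown (e.g., masked at long range)
        results.append((best_id if best_sim >= threshold else None, best_sim))
    return results
```
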
Learning Occlusion-Aware Dense Correspondences for Multi-Modal Images
Ryosuke Shimoya, Takashi Morimoto, J. van Baar, P. Boufounos, Yanting Ma, Hassan Mansour
{"title":"Learning Occlusion-Aware Dense Correspondences for Multi-Modal Images","authors":"Ryosuke Shimoya, Takashi Morimoto, J. van Baar, P. Boufounos, Yanting Ma, Hassan Mansour","doi":"10.1109/AVSS56176.2022.9959354","DOIUrl":"https://doi.org/10.1109/AVSS56176.2022.9959354","url":null,"abstract":"We introduce a scalable multi-modal approach to learn dense, i.e., pixel-level, correspondences and occlusion maps, between images in a video sequence. The problems of finding dense correspondences and occlusion maps are fundamental in computer vision. In this work we jointly train a deep network to tackle both, with a shared feature extraction stage. We use depth and color images with ground truth optical flow and occlusion maps to train the network end-to-end. From the multi-modal input, the network learns to estimate occlusion maps, optical flows, and a correspondence embedding providing a meaningful latent feature space. We evaluate the performance on a dataset of images derived from synthetic characters, and perform a thorough ablation study to demonstrate that the proposed components of our architecture combine to achieve the lowest correspondence error. The scalability of our proposed method comes from the ability to incorporate additional modalities, e.g., infrared images.","PeriodicalId":408581,"journal":{"name":"2022 18th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123237647","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
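A minimal sketch of the shared-encoder, multi-head idea described above (one feature extractor feeding flow, occlusion, and embedding heads). The layer sizes, the 8-channel stacking of two RGB-D frames, and the 32-dimensional embedding are assumptions, not the paper's architecture.

```python
# Hedged sketch: shared encoder with separate flow / occlusion / embedding heads.
import torch
import torch.nn as nn

class JointFlowOcclusionNet(nn.Module):
    def __init__(self, in_channels=8, feat=64):
        # in_channels: two stacked RGB-D frames -> 2 * (3 + 1) = 8 channels (assumed)
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, feat, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat, feat, 3, padding=1), nn.ReLU(inplace=True),
        )
        self.flow_head = nn.Conv2d(feat, 2, 3, padding=1)    # (dx, dy) per pixel
        self.occ_head = nn.Conv2d(feat, 1, 3, padding=1)     # occlusion logit
        self.embed_head = nn.Conv2d(feat, 32, 1)             # correspondence embedding

    def forward(self, frames):
        f = self.encoder(frames)
        return self.flow_head(f), torch.sigmoid(self.occ_head(f)), self.embed_head(f)
```
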
Background Subtraction Network Module Ensemble for Background Scene Adaptation
Taiki Hamada, T. Minematsu, Atsushi Shimada, Fumiya Okubo, Yuta Taniguchi
{"title":"Background Subtraction Network Module Ensemble for Background Scene Adaptation","authors":"Taiki Hamada, T. Minematsu, Atsushi Shimada, Fumiya Okubo, Yuta Taniguchi","doi":"10.1109/AVSS56176.2022.9959316","DOIUrl":"https://doi.org/10.1109/AVSS56176.2022.9959316","url":null,"abstract":"Background subtraction networks outperform traditional hand-craft background subtraction methods. The main advantage of background subtraction networks is their ability to automatically learn background features for training scenes. When applying the trained network to new target scenes, adapting the network to the new scenes is crucial. However, few studies have focused on reusing multiple trained models for new target scenes. Considering background changes have several categories, such as illumination changes, a model trained for each background scene can work effectively for the target scene similar to the training scene. In this study, we propose a method to ensemble the module networks trained for each background scene. Experimental results show that the proposed method is significantly more accurate compared with the conventional methods in the target scene by tuning with only a few frames.","PeriodicalId":408581,"journal":{"name":"2022 18th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)","volume":"115 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117259698","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
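A minimal sketch of ensembling per-scene background-subtraction modules and tuning the combination on a few annotated target-scene frames; weighting the modules by a softmax over their per-frame accuracy is an assumption of this sketch, not the paper's ensembling rule.

```python
# Hedged sketch: weighted ensemble of per-scene background-subtraction modules.
import numpy as np

def ensemble_predict(frame, modules, weights):
    probs = np.stack([m(frame) for m in modules], axis=0)   # (M, H, W) foreground probs
    return np.tensordot(weights, probs, axes=1)             # weighted fusion

def tune_weights(frames, masks, modules, temperature=10.0):
    """Score each trained module on a few annotated target-scene frames."""
    scores = []
    for m in modules:
        acc = np.mean([np.mean((m(f) > 0.5) == (gt > 0.5)) for f, gt in zip(frames, masks)])
        scores.append(acc)
    scores = np.asarray(scores) * temperature
    w = np.exp(scores - scores.max())
    return w / w.sum()                                       # softmax weights
```
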
Dual Camera Based High Spatio-Temporal Resolution Video Generation For Wide Area Surveillance
H. U. Suluhan, H. Ateş, B. Gunturk
{"title":"Dual Camera Based High Spatio-Temporal Resolution Video Generation For Wide Area Surveillance","authors":"H. U. Suluhan, H. Ateş, B. Gunturk","doi":"10.1109/AVSS56176.2022.9959711","DOIUrl":"https://doi.org/10.1109/AVSS56176.2022.9959711","url":null,"abstract":"Wide area surveillance (WAS) requires high spatiotemporal resolution (HSTR) video for better precision. As an alternative to expensive WAS systems, low-cost hybrid imaging systems can be used. This paper presents the usage of multiple video feeds for the generation of HSTR video as an extension of reference based super resolution (RefSR). One feed captures video at high spatial resolution with low frame rate (HSLF) while the other captures low spatial resolution and high frame rate (LSHF) video simultaneously for the same scene. The main purpose is to create an HSTR video from the fusion of HSLF and LSHF videos. In this paper we propose an end-to-end trainable deep network that performs optical flow (OF) estimation and frame reconstruction by combining inputs from both video feeds. The proposed architecture provides significant improvement over existing video frame interpolation and RefSR techniques in terms of PSNR and SSIM metrics and can be deployed on drones with dual cameras.","PeriodicalId":408581,"journal":{"name":"2022 18th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)","volume":"26 6","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132870904","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
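To illustrate how the two feeds can be combined, here is a classical (non-learned) sketch: motion estimated on the low-resolution, high-frame-rate feed is upscaled and used to warp the nearest high-resolution keyframe toward an intermediate time. OpenCV's Farneback flow stands in for the paper's learned optical-flow and reconstruction network, and the backward-warping step is a rough approximation.

```python
# Hedged sketch: warp an HR keyframe using flow estimated on the LR high-FPS feed.
import cv2
import numpy as np

def warp_keyframe(hslf_key, lshf_prev, lshf_cur):
    """hslf_key: HR grayscale keyframe; lshf_prev/cur: consecutive LR grayscale frames."""
    H, W = hslf_key.shape[:2]
    flow = cv2.calcOpticalFlowFarneback(lshf_prev, lshf_cur, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    # upscale the flow field to HR and rescale its magnitude accordingly
    scale_x, scale_y = W / flow.shape[1], H / flow.shape[0]
    flow_hr = cv2.resize(flow, (W, H))
    flow_hr[..., 0] *= scale_x
    flow_hr[..., 1] *= scale_y
    gx, gy = np.meshgrid(np.arange(W, dtype=np.float32),
                         np.arange(H, dtype=np.float32))
    # approximate backward warp: sample the keyframe where each HR pixel came from
    map_x = (gx - flow_hr[..., 0]).astype(np.float32)
    map_y = (gy - flow_hr[..., 1]).astype(np.float32)
    return cv2.remap(hslf_key, map_x, map_y, cv2.INTER_LINEAR)
```
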
Accelerated Blind Deblurring Method via Video-based Estimation in Next Point Spread Functions for Surveillance
A. Güven, Ceren Özçelik, D. M. Sazak
{"title":"Accelerated Blind Deblurring Method via Video-based Estimation in Next Point Spread Functions for Surveillance","authors":"A. Güven, Ceren Özçelik, D. M. Sazak","doi":"10.1109/AVSS56176.2022.9959473","DOIUrl":"https://doi.org/10.1109/AVSS56176.2022.9959473","url":null,"abstract":"Blind deblurring has been attracting increased attention. In real-life problems, high-resolution images are needed to process and the blurring function, point spread function (PSF), is mostly unknown, especially in the surveillance systems such as camera integrated payload drop with a parachute. The PSFs are dependent on their previous functions, so we perform the deblurring process faster with our proposed model by integrating a previously prepared deep learning method. Our system consists of four phases: (i) enhancing images with an existing deep learning method, (ii) obtaining PSFs, (iii) predicting the next PSFs with our model, and (iv) enhancing the images with the wienerfiltering we developed. The number of PSFs to be estimated was experimentally found as the point at which the PSNR value began to decrease in the test images. Convolutional LSTM layers were used for our model which has been compared with other state-of-the-art models in terms of performance and running time.","PeriodicalId":408581,"journal":{"name":"2022 18th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117136654","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
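The final phase applies Wiener filtering with a known or predicted PSF. Below is a standard frequency-domain Wiener deconvolution sketch; the ConvLSTM PSF predictor itself is not reproduced (`psf` is assumed to come from it), and the constant noise-to-signal ratio is an assumed parameter.

```python
# Hedged sketch of frequency-domain Wiener deconvolution with a given PSF.
import numpy as np

def wiener_deblur(blurred, psf, nsr=0.01):
    """blurred: 2-D grayscale image; psf: 2-D kernel; nsr: noise-to-signal ratio."""
    psf_pad = np.zeros_like(blurred, dtype=np.float64)
    kh, kw = psf.shape
    psf_pad[:kh, :kw] = psf / psf.sum()
    # shift the kernel so its center sits at the origin (avoids a shifted output)
    psf_pad = np.roll(psf_pad, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    H = np.fft.fft2(psf_pad)
    G = np.fft.fft2(blurred)
    F_hat = np.conj(H) / (np.abs(H) ** 2 + nsr) * G
    return np.real(np.fft.ifft2(F_hat))
```
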
FPDM: Fisheye Panoptic segmentation dataset for Door Monitoring
Mohamed Thioune, Sanaa Chafik, Ankur Mahtani, Olivier Laurendin, Safia Boudra
{"title":"FPDM: Fisheye Panoptic segmentation dataset for Door Monitoring","authors":"Mohamed Thioune, Sanaa Chafik, Ankur Mahtani, Olivier Laurendin, Safia Boudra","doi":"10.1109/AVSS56176.2022.9959151","DOIUrl":"https://doi.org/10.1109/AVSS56176.2022.9959151","url":null,"abstract":"Most existing panoptic segmentation datasets are not suited for applications in the railway environment. This paper introduces a new dataset composed of video feeds taken in the vicinity of train doors. It is aimed at the training of deep learning algorithms to identify the obstacles between doors to ensure passenger safety during boarding and to reduce boarding time. The dataset is acquired from fisheye cameras located at the train doors. The data is annotated entirely manually. The Fisheye Panoptic Door Monitoring dataset (FPDM) contains 3952 images with their annotation masks featuring 18 of the most frequent instance categories in the vicinity of train doors. FPDM answers the panoptic segmentation challenge by offering a new challenging dataset for the computer vision community. We present detailed information on the process of acquisition, annotation, and division of the data into training and validation sets in addition with an evaluation of an existing deep learning method.","PeriodicalId":408581,"journal":{"name":"2022 18th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126946653","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Automated Single Particle Growth Measurement using Segmentation
M. Rafique, Muhamamd Ishfaq Hussain, M. Hassan, W. Jung, Bong-Joong Kim, M. Jeon
{"title":"Automated Single Particle Growth Measurement using Segmentation","authors":"M. Rafique, Muhamamd Ishfaq Hussain, M. Hassan, W. Jung, Bong-Joong Kim, M. Jeon","doi":"10.1109/AVSS56176.2022.9959296","DOIUrl":"https://doi.org/10.1109/AVSS56176.2022.9959296","url":null,"abstract":"Fine-grain imaging is revealing secrets of nature with every passing day and artificial intelligence is reducing the manual effort required for detailed analysis. This work proposes an automated growth measurement of a particle in electron microscopic images in real-time. The particle selected in this study is an Au spiky nanoparticle (SNP) that develops spikes over the course of its growth. In this study, multiple techniques from conventional and sophisticated algorithms are used to segment the particle using supervised and unsupervised learning techniques. A comprehensive analysis of the automated techniques is presented with qualitative and quantitative results.","PeriodicalId":408581,"journal":{"name":"2022 18th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125525262","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
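Once per-frame segmentation masks are available, growth measurement reduces to tracking the segmented area over time. A minimal sketch follows; the pixel-to-nanometer calibration and frame interval are assumed parameters, not values from the paper.

```python
# Hedged sketch: turn per-frame particle masks into a growth curve and growth rate.
import numpy as np

def growth_curve(masks, nm_per_pixel=1.0):
    """masks: iterable of 2-D boolean arrays (True = particle) in temporal order.
    Returns the projected particle area per frame in nm^2."""
    return np.array([mask.sum() * nm_per_pixel ** 2 for mask in masks])

def growth_rate(areas, frame_interval_s=1.0):
    """Finite-difference growth rate between consecutive frames (nm^2 per second)."""
    return np.diff(areas) / frame_interval_s
```
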
Robust Unseen Video Understanding for Various Surveillance Environments
Prashant W. Patil, Jasdeep Singh, Praful Hambarde, Ashutosh Kulkarni, S. Chaudhary, S. Murala
{"title":"Robust Unseen Video Understanding for Various Surveillance Environments","authors":"Prashant W. Patil, Jasdeep Singh, Praful Hambarde, Ashutosh Kulkarni, S. Chaudhary, S. Murala","doi":"10.1109/AVSS56176.2022.9959513","DOIUrl":"https://doi.org/10.1109/AVSS56176.2022.9959513","url":null,"abstract":"Automated video-based applications are a highly demanding technique from a security perspective, where detection of moving objects i.e., moving object segmentation (MOS) is performed. Therefore, we have proposed an effective solution with a spatio-temporal squeeze excitation mechanism (SqEm) based multi-level feature sharing encoder-decoder network for MOS. Here, the SqEm module is proposed to get prominent foreground edge information using spatio-temporal features. Further, a multi-level feature sharing residual decoder module is proposed with respective SqEm features and previous output features for accurate and consistent foreground segmentation. To handle the foreground or background class imbalance issue, we propose a region of interest-based edge loss. The extensive experimental analysis on three databases is conducted. Result analysis and ablation study proved the robustness of the proposed network for unseen video understanding over SOTA methods.","PeriodicalId":408581,"journal":{"name":"2022 18th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126826494","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
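A minimal sketch of a squeeze-excitation style block applied to spatio-temporal (B, C, T, H, W) features, in the spirit of the SqEm module described above; this is a generic 3-D SE block under my own assumptions, not the paper's exact design.

```python
# Hedged sketch: channel squeeze-excitation over spatio-temporal features.
import torch
import torch.nn as nn

class SpatioTemporalSE(nn.Module):
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.squeeze = nn.AdaptiveAvgPool3d(1)               # pool over T, H, W
        self.excite = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):                                    # x: (B, C, T, H, W)
        b, c = x.shape[:2]
        w = self.excite(self.squeeze(x).view(b, c)).view(b, c, 1, 1, 1)
        return x * w                                         # channel-wise reweighting
```
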
Deformable Modules for Flexible Feature Sampling on Vision Transformer
Chanjong Park, Dongha Bahn, Jae-il Jung
{"title":"Deformable Modules for Flexible Feature Sampling on Vision Transformer","authors":"Chanjong Park, Dongha Bahn, Jae-il Jung","doi":"10.1109/AVSS56176.2022.9959253","DOIUrl":"https://doi.org/10.1109/AVSS56176.2022.9959253","url":null,"abstract":"Vision transformers have shown that the self-attention mechanism performs well in the computer vision field. However, since such transformers are based on data sampled from fixed areas, there is a limit to efficiently learning the important features in images. To compensate, we propose two modules based on the deformable operation: deformable patch embedding and deformable pooling. Deformable patch embedding consists of a hybrid structure of standard and deformable convolutions, and adaptively samples features from an image. The deformable pooling module also has a similar structure to the embedding module, but it not only samples data flexibly after self-attention but also allows the transformer to learn spatial information of various scales. The experimental results show that the transformer with the proposed modules converges faster and outperforms various vision transformers on image classification (ImageNet-1K) and object detection (MS-COCO).","PeriodicalId":408581,"journal":{"name":"2022 18th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116605704","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
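A minimal sketch of a patch-embedding layer that mixes a standard strided convolution with a deformable convolution whose offsets are predicted from the input, using torchvision's DeformConv2d. Summing the two branches and the layer sizes are assumptions of this sketch, not the paper's exact module.

```python
# Hedged sketch: hybrid standard + deformable convolution patch embedding.
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class DeformablePatchEmbed(nn.Module):
    def __init__(self, in_channels=3, embed_dim=96, patch_size=4):
        super().__init__()
        self.proj = nn.Conv2d(in_channels, embed_dim, patch_size, stride=patch_size)
        # 2 offsets (dx, dy) per kernel element, predicted from the input
        self.offset = nn.Conv2d(in_channels, 2 * patch_size * patch_size,
                                patch_size, stride=patch_size)
        self.deform = DeformConv2d(in_channels, embed_dim, patch_size, stride=patch_size)

    def forward(self, x):                                     # x: (B, 3, H, W)
        offsets = self.offset(x)
        tokens = self.proj(x) + self.deform(x, offsets)       # fixed + adaptive sampling
        return tokens.flatten(2).transpose(1, 2)              # (B, N, embed_dim)
```
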