Latest publications from the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)

CNNs Fusion for Building Detection in Aerial Images for the Building Detection Challenge
Rémi Delassus, R. Giot
{"title":"CNNs Fusion for Building Detection in Aerial Images for the Building Detection Challenge","authors":"Rémi Delassus, R. Giot","doi":"10.1109/CVPRW.2018.00044","DOIUrl":"https://doi.org/10.1109/CVPRW.2018.00044","url":null,"abstract":"This paper presents our contribution to the DeepGlobe Building Detection Challenge. We enhanced the SpaceNet Challenge winning solution by proposing a new fusion strategy based on a deep combiner using segmentation both results of different CNN and input data to segment. Segmentation results for all cities have been significantly improved (between 1% improvement over the baseline for the smallest one to more than 7% for the biggest one). The separation of adjacent buildings should be the next enhancement made to the solution.","PeriodicalId":150600,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129544891","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 11

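To make the combiner idea concrete, here is a minimal PyTorch sketch of a fusion network that sees the input imagery together with the probability maps of several base segmentation CNNs and learns a fused building mask. The layer sizes, the two base models, and the `FusionCombiner` name are illustrative assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class FusionCombiner(nn.Module):
    """Minimal sketch of a deep combiner: it sees the RGB input plus the
    probability maps produced by two base segmentation CNNs, and learns
    to output a fused building mask. Layer sizes are illustrative."""
    def __init__(self, num_base_models=2, in_channels=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels + num_base_models, 32, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(32, 32, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(32, 1, kernel_size=1),  # fused building-probability map
        )

    def forward(self, image, base_predictions):
        # image: (B, 3, H, W); base_predictions: list of (B, 1, H, W) maps
        x = torch.cat([image] + base_predictions, dim=1)
        return torch.sigmoid(self.net(x))

# Usage: fuse two base models' outputs on a dummy tile
combiner = FusionCombiner()
img = torch.rand(1, 3, 256, 256)
preds = [torch.rand(1, 1, 256, 256) for _ in range(2)]
fused = combiner(img, preds)  # (1, 1, 256, 256)
```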
HSCNN+: Advanced CNN-Based Hyperspectral Recovery from RGB Images
Zhan Shi, C. Chen, Zhiwei Xiong, Dong Liu, Feng Wu
{"title":"HSCNN+: Advanced CNN-Based Hyperspectral Recovery from RGB Images","authors":"Zhan Shi, C. Chen, Zhiwei Xiong, Dong Liu, Feng Wu","doi":"10.1109/CVPRW.2018.00139","DOIUrl":"https://doi.org/10.1109/CVPRW.2018.00139","url":null,"abstract":"Hyperspectral recovery from a single RGB image has seen a great improvement with the development of deep convolutional neural networks (CNNs). In this paper, we propose two advanced CNNs for the hyperspectral reconstruction task, collectively called HSCNN+. We first develop a deep residual network named HSCNN-R, which comprises a number of residual blocks. The superior performance of this model comes from the modern architecture and optimization by removing the hand-crafted upsampling in HSCNN. Based on the promising results of HSCNN-R, we propose another distinct architecture that replaces the residual block by the dense block with a novel fusion scheme, leading to a new network named HSCNN-D. This model substantially deepens the network structure for a more accurate solution. Experimental results demonstrate that our proposed models significantly advance the state-of-the-art. In the NTIRE 2018 Spectral Reconstruction Challenge, our entries rank the 1st (HSCNN-D) and 2nd (HSCNN-R) places on both the \"Clean\" and \"Real World\" tracks. (Codes are available at [clean-r], [realworld-r], [clean-d], and [realworld-d].)","PeriodicalId":150600,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128300491","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 158

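A minimal sketch of the HSCNN-R building block follows: residual blocks over a fixed-width feature space, with a head lifting the 3-channel RGB input and a tail projecting to the hyperspectral bands. The depth, width and 31-band output are assumptions for illustration (31 bands is common in spectral-reconstruction benchmarks), not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Plain residual block (conv-ReLU-conv plus identity skip)."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return x + self.conv2(self.relu(self.conv1(x)))

class TinyHSCNN(nn.Module):
    """Sketch of an HSCNN-R-style network: lift RGB to a feature space,
    apply residual blocks, project to the hyperspectral bands. Depth,
    width and the 31-band output are illustrative choices."""
    def __init__(self, out_bands=31, width=64, depth=4):
        super().__init__()
        self.head = nn.Conv2d(3, width, 3, padding=1)
        self.body = nn.Sequential(*[ResidualBlock(width) for _ in range(depth)])
        self.tail = nn.Conv2d(width, out_bands, 3, padding=1)

    def forward(self, rgb):
        return self.tail(self.body(self.head(rgb)))

spectral = TinyHSCNN()(torch.rand(1, 3, 64, 64))  # (1, 31, 64, 64)
```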
Time Analysis of Pulse-Based Face Anti-Spoofing in Visible and NIR
J. Hernandez-Ortega, Julian Fierrez, A. Morales, Pedro Tome
{"title":"Time Analysis of Pulse-Based Face Anti-Spoofing in Visible and NIR","authors":"J. Hernandez-Ortega, Julian Fierrez, A. Morales, Pedro Tome","doi":"10.1109/CVPRW.2018.00096","DOIUrl":"https://doi.org/10.1109/CVPRW.2018.00096","url":null,"abstract":"In this paper we study Presentation Attack Detection (PAD) in face recognition systems against realistic artifacts such as 3D masks or good quality of photo attacks. In recent works, pulse detection based on remote photoplethysmography (rPPG) has shown to be a effective countermeasure in concrete setups, but still there is a need for a deeper understanding of when and how this kind of PAD works in various practical conditions. Related works analyze full video sequences (usually over 60 seconds) to distinguish between attacks and legitimate accesses. However, existing approaches may not be as effective as it has been claimed in the literature in time variable scenarios. In this paper we evaluate the performance of an existent state-of-the-art PAD scheme based on rPPG when analyzing short-time video sequences extracted from a longer video. Results are reported using the 3D Mask Attack Database (3DMAD), and a self-collected dataset called Heart Rate Database (HR), including different video durations, spectrum bands, resolutions and frame rates. Several conclusions can be drawn from this work: a) PAD performance based on rPPG varies significantly with the length of the analyzed video, b) rPPG information extracted from short-time sequences (over 5 seconds) can be discriminant enough for performing the PAD task, c) in general, videos using the NIR band perform better than those using the RGB band, and d) the temporal resolution is more valuable for rPPG signal extraction than the spatial resolution.","PeriodicalId":150600,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128375552","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 54

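The rPPG signal at the heart of this PAD scheme can be sketched in a few lines: average the green channel over the face region per frame, band-pass the trace to plausible heart-rate frequencies, and check how dominant the spectral peak is. The filter band, window length and `rppg_pulse_strength` helper are illustrative assumptions, not the paper's exact pipeline.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def rppg_pulse_strength(roi_means, fps):
    """Sketch of the rPPG idea behind pulse-based PAD: take the mean
    green-channel value of the face region per frame, band-pass it to
    plausible heart-rate frequencies (0.7-4 Hz, i.e. 42-240 bpm), and
    measure how dominant the spectral peak is."""
    signal = np.asarray(roi_means, dtype=float)
    signal = signal - signal.mean()
    b, a = butter(3, [0.7, 4.0], btype="band", fs=fps)
    filtered = filtfilt(b, a, signal)
    spectrum = np.abs(np.fft.rfft(filtered)) ** 2
    freqs = np.fft.rfftfreq(len(filtered), d=1.0 / fps)
    peak = spectrum.argmax()
    bpm = 60.0 * freqs[peak]
    dominance = spectrum[peak] / (spectrum.sum() + 1e-12)
    return bpm, dominance  # a live face should show a clear cardiac peak

# ~5 s of video at 30 fps, matching the paper's short-sequence finding:
t = np.arange(150) / 30.0
simulated_pulse = 0.5 * np.sin(2 * np.pi * 1.2 * t) + 0.1 * np.random.randn(150)
print(rppg_pulse_strength(simulated_pulse, fps=30))  # ~72 bpm, strong peak
```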
VisDA: A Synthetic-to-Real Benchmark for Visual Domain Adaptation
Xingchao Peng, Ben Usman, Neela Kaushik, Dequan Wang, Judy Hoffman, Kate Saenko
{"title":"VisDA: A Synthetic-to-Real Benchmark for Visual Domain Adaptation","authors":"Xingchao Peng, Ben Usman, Neela Kaushik, Dequan Wang, Judy Hoffman, Kate Saenko","doi":"10.1109/CVPRW.2018.00271","DOIUrl":"https://doi.org/10.1109/CVPRW.2018.00271","url":null,"abstract":"The success of machine learning methods on visual recognition tasks is highly dependent on access to large labeled datasets. However, real training images are expensive to collect and annotate for both computer vision and robotic applications. The synthetic images are easy to generate but model performance often drops significantly on data from a new deployment domain, a problem known as dataset shift, or dataset bias. Changes in the visual domain can include lighting, camera pose and background variation, as well as general changes in how the image data is collected. While this problem has been studied extensively in the domain adaptation literature, progress has been limited by the lack of large-scale challenge benchmarks.","PeriodicalId":150600,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"127 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128633267","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 130

Traffic Flow Analysis with Multiple Adaptive Vehicle Detectors and Velocity Estimation with Landmark-Based Scanlines
M. Tran, Tung Dinh Duy, Thanh-Dat Truong, V. Ton-That, Thanh-Nhon Do, Quoc-An Luong, Thanh-An Nguyen, Vinh-Tiep Nguyen, M. Do
{"title":"Traffic Flow Analysis with Multiple Adaptive Vehicle Detectors and Velocity Estimation with Landmark-Based Scanlines","authors":"M. Tran, Tung Dinh Duy, Thanh-Dat Truong, V. Ton-That, Thanh-Nhon Do, Quoc-An Luong, Thanh-An Nguyen, Vinh-Tiep Nguyen, M. Do","doi":"10.1109/CVPRW.2018.00021","DOIUrl":"https://doi.org/10.1109/CVPRW.2018.00021","url":null,"abstract":"In this paper, we propose our method for vehicle detection with multiple adaptive vehicle detectors and velocity estimation with landmark-based scanlines. Inspired by the idea for tiny object detection, we use Faster R-CNN with Resnet-101 to create different specialized vehicle detectors corresponding to different levels of details and poses. We propose a heuristic to check the fitness of a particular vehicle detector to a specific region in camera's view by the mean velocity direction and the mean object size. By this way, we can determine an adaptive set of appropriate vehicle detectors for each region in camera's view. Thus our system is expected to detect vehicles with high accuracy, both in precision and recall, even with tiny objects. We exploit the U.S. road rules for the length and distance of broken white lines on roads to propose our method for vehicle's velocity estimation using such landmarks. We determine equally-distributed scanlines, virtual parallel lines that are nearly-perpendicular to the road direction, with reference to the line connecting the corresponding ends of multiple broken white lines. From the timespan for a vehicle to cross two consecutive virtual scanlines, we can calculate the average vehicle's velocity within that road segment. We also refine the speed estimation by detecting when a vehicle stops at a traffic light, and smooth the results with a moving average filter. Experiments on the dataset of Traffic Flow Analysis from NVIDIA AI City Challenge 2018 show that our method achieves the perfect detect rate of 100%, the average velocity difference of 6.9762 mph on freeways, and 8.9144 mph on both freeways and urban roads.","PeriodicalId":150600,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"84 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121065769","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 17

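The scanline-based speed computation reduces to distance over time once the scanline spacing is tied to the lane markings. The sketch below assumes a 40 ft marking cycle (a 10 ft painted stripe plus a 30 ft gap, a common U.S. convention of the kind the paper exploits); the frame numbers and the `speed_mph` helper are hypothetical.

```python
# Sketch of the landmark-based scanline idea: if virtual scanlines are
# anchored to broken lane lines with known spacing, a vehicle's average
# speed is just spacing / crossing time.

FT_PER_CYCLE = 40.0            # one stripe + one gap (assumed convention)
MPH_PER_FT_PER_S = 3600.0 / 5280.0  # feet/second -> miles/hour

def speed_mph(frame_a, frame_b, video_fps, scanline_spacing_ft=FT_PER_CYCLE):
    """Average speed between two consecutive scanline crossings."""
    dt = (frame_b - frame_a) / video_fps   # seconds between crossings
    return (scanline_spacing_ft / dt) * MPH_PER_FT_PER_S

# A vehicle crossing consecutive scanlines 27 frames apart at 30 fps:
print(round(speed_mph(100, 127, video_fps=30), 1))  # ~30.3 mph
```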
Estimating Attention of Faces Due to its Growing Level of Emotions
R. Kumar, Jogendra Garain, D. Kisku, G. Sanyal
{"title":"Estimating Attention of Faces Due to its Growing Level of Emotions","authors":"R. Kumar, Jogendra Garain, D. Kisku, G. Sanyal","doi":"10.1109/CVPRW.2018.00261","DOIUrl":"https://doi.org/10.1109/CVPRW.2018.00261","url":null,"abstract":"In the task of attending faces in the disciplined assembly (Like in examination hall or Silent public places), our gaze automatically goes towards those persons who exhibits their expression other than the normal expression. It happens due to finding of dissimilar expression among the gathering of normal. In order to modeling this concept in the intelligent vision of computer system, hardly some effective researches have been succeeded. Therefore, in this proposal we have tried to come out with a solution for handling such challenging task of computer vision. Actually, this problem is related to cognitive aspect of visual attention. In the literature of visual saliency authors have dealt with expressionless objects but it has not been addressed with object like face which exploits expressions. Visual saliency is a term which differentiates \"appealing\" visual substance from others, based on their feature differences. In this paper, in the set of multiple faces, 'Salient face' has been explored based on 'emotion deviation' from the normal. In the first phase of the experiment, face detection task has been accomplished using Viola Jones face detector. The concept of deep convolution neural network (CNN) has been applied for training and classification of different facial expression of emotions. Moreover, saliency score of every face of the input image have been computed by measuring their 'emotion score' which depends upon the deviation from the 'normal expression' scores. This proposed approach exhibits fairly good result which may give a new dimension to the researchers towards the modeling of an intelligent vision system which can be useful in the task of visual security and surveillance.","PeriodicalId":150600,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121171568","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3

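A minimal sketch of the saliency scoring described above: given each face's softmax distribution over expression classes, score its deviation from the neutral class. The specific scoring rule below is an assumption for illustration, not the paper's exact emotion-score formula.

```python
import numpy as np

def emotion_saliency(expression_probs, neutral_idx=0):
    """Score each face by the probability mass it assigns away from the
    neutral class, i.e. its deviation from the 'normal expression'."""
    probs = np.asarray(expression_probs, dtype=float)
    return 1.0 - probs[:, neutral_idx]

# Three faces over classes [neutral, happy, surprised, angry]:
faces = np.array([
    [0.90, 0.05, 0.03, 0.02],   # calm face
    [0.85, 0.10, 0.03, 0.02],   # calm face
    [0.15, 0.05, 0.70, 0.10],   # surprised face -> most salient
])
scores = emotion_saliency(faces)
print(scores.argmax(), scores)  # index 2 stands out from the assembly
```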
Rotated Rectangles for Symbolized Building Footprint Extraction
Matt Dickenson, L. Gueguen
{"title":"Rotated Rectangles for Symbolized Building Footprint Extraction","authors":"Matt Dickenson, L. Gueguen","doi":"10.1109/CVPRW.2018.00039","DOIUrl":"https://doi.org/10.1109/CVPRW.2018.00039","url":null,"abstract":"Building footprints (BFP) provide useful visual context for users of digital maps when navigating in space. This paper proposes a method for extracting and symbolizing building footprints from satellite imagery using a convolutional neural network (CNN). The CNN architecture outputs rotated rectangles, providing a symbolized approximation that works well for small buildings. Experiments are conducted on the four cities in the DeepGlobe Challenge dataset (Las Vegas, Paris, Shanghai, Khartoum). Our method performs best on suburbs consisting of individual houses. These experiments show that either large buildings or buildings without clear delineation produce weaker results in terms of precision and recall.","PeriodicalId":150600,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114727718","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 12

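The rotated-rectangle output can be made concrete with a small helper that converts a (center, width, height, angle) parameterization into the four footprint corners. This parameterization is a common convention assumed here; the paper's exact output head may differ.

```python
import numpy as np

def rotated_rect_corners(cx, cy, w, h, theta):
    """(center, width, height, angle in radians) -> 4 corner points
    of the symbolized building footprint, counter-clockwise."""
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s], [s, c]])
    half = np.array([[-w, -h], [w, -h], [w, h], [-w, h]]) / 2.0
    return half @ rot.T + np.array([cx, cy])

# A 20x10 building rotated 30 degrees about (50, 50):
print(rotated_rect_corners(50, 50, 20, 10, np.deg2rad(30)))
```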
Crowd Activity Change Point Detection in Videos via Graph Stream Mining
Meng Yang, Lida Rashidi, S. Rajasegarar, C. Leckie, A. S. Rao, M. Palaniswami
{"title":"Crowd Activity Change Point Detection in Videos via Graph Stream Mining","authors":"Meng Yang, Lida Rashidi, S. Rajasegarar, C. Leckie, A. S. Rao, M. Palaniswami","doi":"10.1109/CVPRW.2018.00059","DOIUrl":"https://doi.org/10.1109/CVPRW.2018.00059","url":null,"abstract":"In recent years, there has been a growing interest in detecting anomalous behavioral patterns in video. In this work, we address this task by proposing a novel activity change point detection method to identify crowd movement anomalies for video surveillance. In our proposed novel framework, a hyperspherical clustering algorithm is utilized for the automatic identification of interesting regions, then the density of pedestrian flows between every pair of interesting regions over consecutive time intervals is monitored and represented as a sequence of adjacency matrices where the direction and density of flows are captured through a directed graph. Finally, we use graph edit distance as well as a cumulative sum test to detect change points in the graph sequence. We conduct experiments on four real-world video datasets: Dublin, New Orleans, Abbey Road and MCG Datasets. We observe that our proposed approach achieves a high F-measure, i.e., in the range [0.7, 1], for these datasets. The evaluation reveals that our proposed method can successfully detect the change points in all datasets at both global and local levels. Our results also demonstrate the efficiency and effectiveness of our proposed algorithm for change point detection and segmentation tasks.","PeriodicalId":150600,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"138 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125845824","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4

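A compact sketch of the detection stage described above: compute a distance between consecutive flow graphs (summed absolute edge-weight change is used below as a simple stand-in for graph edit distance) and run a one-sided CUSUM test over the distance sequence. The threshold and drift values are illustrative, not the paper's tuned settings.

```python
import numpy as np

def flow_graph_distances(adjacency_seq):
    """Distance between consecutive flow graphs; summed absolute
    edge-weight change stands in for graph edit distance."""
    return np.array([np.abs(b - a).sum()
                     for a, b in zip(adjacency_seq, adjacency_seq[1:])])

def cusum_change_points(values, threshold, drift=0.0):
    """One-sided cumulative-sum test: flag a change point whenever the
    accumulated positive deviation from the mean exceeds a threshold."""
    s, changes = 0.0, []
    mean = values.mean()
    for i, v in enumerate(values):
        s = max(0.0, s + (v - mean - drift))
        if s > threshold:
            changes.append(i)
            s = 0.0
    return changes

# 10 flow graphs over 3 regions, with a sudden flow surge after step 5:
rng = np.random.default_rng(0)
graphs = [rng.poisson(2, (3, 3)).astype(float) for _ in range(6)]
graphs += [rng.poisson(9, (3, 3)).astype(float) for _ in range(4)]
d = flow_graph_distances(graphs)
print(cusum_change_points(d, threshold=40.0))  # flags the surge
```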
AIC2018 Report: Traffic Surveillance Research
Tingyu Mao, Wei Zhang, Haoyu He, Yanjun Lin, Vinay Kale, Alexander Stein, Z. Kostić
{"title":"AIC2018 Report: Traffic Surveillance Research","authors":"Tingyu Mao, Wei Zhang, Haoyu He, Yanjun Lin, Vinay Kale, Alexander Stein, Z. Kostić","doi":"10.1109/CVPRW.2018.00019","DOIUrl":"https://doi.org/10.1109/CVPRW.2018.00019","url":null,"abstract":"Traffic surveillance and management technologies are some of the most intriguing aspects of smart city applications. In this paper, we investigate and present the methods for vehicle detections, tracking, speed estimation and anomaly detection for NVIDIA AI City Challenge 2018 (AIC2018). We applied Mask-RCNN and deep-sort for vehicle detection and tracking in track 1, and optical flow based method in track 2. In track 1, we achieve 100% detection rate and 7.97 mile/hour estimation error for speed estimation.","PeriodicalId":150600,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"130 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121964958","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 12

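Deep SORT combines a Kalman motion model with appearance embeddings; the sketch below strips the association step down to greedy IoU matching between existing tracks and new detections, to illustrate the tracking-by-detection idea rather than the authors' full tracker.

```python
import numpy as np

def iou(box_a, box_b):
    """IoU of two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-12)

def associate(tracks, detections, min_iou=0.3):
    """Greedy IoU matching of existing tracks to new detections."""
    matches, used = [], set()
    for t_idx, t_box in enumerate(tracks):
        scores = [(iou(t_box, d), d_idx)
                  for d_idx, d in enumerate(detections) if d_idx not in used]
        if scores:
            best, d_idx = max(scores)
            if best >= min_iou:
                matches.append((t_idx, d_idx))
                used.add(d_idx)
    return matches

tracks = [[0, 0, 10, 10], [50, 50, 70, 70]]
detections = [[52, 51, 71, 72], [1, 0, 11, 10]]
print(associate(tracks, detections))  # [(0, 1), (1, 0)]
```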
Improving Viseme Recognition Using GAN-Based Frontal View Mapping
Dario Augusto Borges Oliveira, Andréa Britto Mattos, E. Morais
{"title":"Improving Viseme Recognition Using GAN-Based Frontal View Mapping","authors":"Dario Augusto Borges Oliveira, Andréa Britto Mattos, E. Morais","doi":"10.1109/CVPRW.2018.00289","DOIUrl":"https://doi.org/10.1109/CVPRW.2018.00289","url":null,"abstract":"Deep learning methods have become the standard for Visual Speech Recognition problems due to their high accuracy results reported in the literature. However, while successful works have been reported for words and sentences, recognizing shorter segments of speech, like phones, has proven to be much more challenging due to the lack of temporal and contextual information. Also, head-pose variation remains a known issue for facial analysis with direct impact in this problem. In this context, we propose a novel methodology to tackle the problem of recognizing visemes – the visual equivalent of phonemes – using a GAN to artificially lock the face view into a perfect frontal view, reducing the view angle variability and simplifying the recognition task performed by our classification CNN. The GAN is trained using a large-scale synthetic 2D dataset based on realistic 3D facial models, automatically labelled for different visemes, mapping a slightly random view to a perfect frontal view. We evaluate our method using the GRID corpus, which was processed to extract viseme images and their corresponding synthetic frontal views to be further classified by our CNN model. Our results demonstrate that the additional synthetic frontal view is able to improve accuracy in 5.9% when compared with classification using the original image only.","PeriodicalId":150600,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115944890","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6

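The two-stage pipeline, frontalize then classify, can be sketched as follows. Both networks, the 64x64 crop size, and the 14-viseme count are illustrative placeholders standing in for the paper's trained GAN generator and CNN classifier.

```python
import torch
import torch.nn as nn

class Frontalizer(nn.Module):
    """Stand-in for the trained GAN generator (image -> frontal image)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(16, 3, 3, padding=1), nn.Tanh(),
        )

    def forward(self, x):
        return self.net(x)

class VisemeClassifier(nn.Module):
    """Stand-in for the classification CNN operating on frontal views."""
    def __init__(self, num_visemes=14):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(16, num_visemes)

    def forward(self, x):
        return self.fc(self.features(x).flatten(1))

crop = torch.rand(1, 3, 64, 64)        # off-angle viseme crop
frontal = Frontalizer()(crop)          # locked frontal view
logits = VisemeClassifier()(frontal)   # viseme prediction
print(logits.shape)                    # torch.Size([1, 14])
```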