Latest publications from the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)

CNNs Fusion for Building Detection in Aerial Images for the Building Detection Challenge
Rémi Delassus, R. Giot
{"title":"CNNs Fusion for Building Detection in Aerial Images for the Building Detection Challenge","authors":"Rémi Delassus, R. Giot","doi":"10.1109/CVPRW.2018.00044","DOIUrl":"https://doi.org/10.1109/CVPRW.2018.00044","url":null,"abstract":"This paper presents our contribution to the DeepGlobe Building Detection Challenge. We enhanced the SpaceNet Challenge winning solution by proposing a new fusion strategy based on a deep combiner using segmentation both results of different CNN and input data to segment. Segmentation results for all cities have been significantly improved (between 1% improvement over the baseline for the smallest one to more than 7% for the biggest one). The separation of adjacent buildings should be the next enhancement made to the solution.","PeriodicalId":150600,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129544891","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 11

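To make the combiner idea concrete, here is a minimal PyTorch sketch of a fusion network that sees the input imagery together with the probability maps of several base segmentation CNNs and learns a fused building mask. The layer sizes, the two base models, and the `FusionCombiner` name are illustrative assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class FusionCombiner(nn.Module):
    """Minimal sketch of a deep combiner: it sees the RGB input plus the
    probability maps produced by two base segmentation CNNs, and learns
    to output a fused building mask. Layer sizes are illustrative."""
    def __init__(self, num_base_models=2, in_channels=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels + num_base_models, 32, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(32, 32, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(32, 1, kernel_size=1),  # fused building-probability map
        )

    def forward(self, image, base_predictions):
        # image: (B, 3, H, W); base_predictions: list of (B, 1, H, W) maps
        x = torch.cat([image] + base_predictions, dim=1)
        return torch.sigmoid(self.net(x))

# Usage: fuse two base models' outputs on a dummy tile
combiner = FusionCombiner()
img = torch.rand(1, 3, 256, 256)
preds = [torch.rand(1, 1, 256, 256) for _ in range(2)]
fused = combiner(img, preds)  # (1, 1, 256, 256)
```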
HSCNN+: Advanced CNN-Based Hyperspectral Recovery from RGB Images
Zhan Shi, C. Chen, Zhiwei Xiong, Dong Liu, Feng Wu
{"title":"HSCNN+: Advanced CNN-Based Hyperspectral Recovery from RGB Images","authors":"Zhan Shi, C. Chen, Zhiwei Xiong, Dong Liu, Feng Wu","doi":"10.1109/CVPRW.2018.00139","DOIUrl":"https://doi.org/10.1109/CVPRW.2018.00139","url":null,"abstract":"Hyperspectral recovery from a single RGB image has seen a great improvement with the development of deep convolutional neural networks (CNNs). In this paper, we propose two advanced CNNs for the hyperspectral reconstruction task, collectively called HSCNN+. We first develop a deep residual network named HSCNN-R, which comprises a number of residual blocks. The superior performance of this model comes from the modern architecture and optimization by removing the hand-crafted upsampling in HSCNN. Based on the promising results of HSCNN-R, we propose another distinct architecture that replaces the residual block by the dense block with a novel fusion scheme, leading to a new network named HSCNN-D. This model substantially deepens the network structure for a more accurate solution. Experimental results demonstrate that our proposed models significantly advance the state-of-the-art. In the NTIRE 2018 Spectral Reconstruction Challenge, our entries rank the 1st (HSCNN-D) and 2nd (HSCNN-R) places on both the \"Clean\" and \"Real World\" tracks. (Codes are available at [clean-r], [realworld-r], [clean-d], and [realworld-d].)","PeriodicalId":150600,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128300491","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 158

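A minimal sketch of the HSCNN-R building block follows: residual blocks over a fixed-width feature space, with a head lifting the 3-channel RGB input and a tail projecting to the hyperspectral bands. The depth, width and 31-band output are assumptions for illustration (31 bands is common in spectral-reconstruction benchmarks), not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Plain residual block (conv-ReLU-conv plus identity skip)."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return x + self.conv2(self.relu(self.conv1(x)))

class TinyHSCNN(nn.Module):
    """Sketch of an HSCNN-R-style network: lift RGB to a feature space,
    apply residual blocks, project to the hyperspectral bands. Depth,
    width and the 31-band output are illustrative choices."""
    def __init__(self, out_bands=31, width=64, depth=4):
        super().__init__()
        self.head = nn.Conv2d(3, width, 3, padding=1)
        self.body = nn.Sequential(*[ResidualBlock(width) for _ in range(depth)])
        self.tail = nn.Conv2d(width, out_bands, 3, padding=1)

    def forward(self, rgb):
        return self.tail(self.body(self.head(rgb)))

spectral = TinyHSCNN()(torch.rand(1, 3, 64, 64))  # (1, 31, 64, 64)
```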
Time Analysis of Pulse-Based Face Anti-Spoofing in Visible and NIR
J. Hernandez-Ortega, Julian Fierrez, A. Morales, Pedro Tome
{"title":"Time Analysis of Pulse-Based Face Anti-Spoofing in Visible and NIR","authors":"J. Hernandez-Ortega, Julian Fierrez, A. Morales, Pedro Tome","doi":"10.1109/CVPRW.2018.00096","DOIUrl":"https://doi.org/10.1109/CVPRW.2018.00096","url":null,"abstract":"In this paper we study Presentation Attack Detection (PAD) in face recognition systems against realistic artifacts such as 3D masks or good quality of photo attacks. In recent works, pulse detection based on remote photoplethysmography (rPPG) has shown to be a effective countermeasure in concrete setups, but still there is a need for a deeper understanding of when and how this kind of PAD works in various practical conditions. Related works analyze full video sequences (usually over 60 seconds) to distinguish between attacks and legitimate accesses. However, existing approaches may not be as effective as it has been claimed in the literature in time variable scenarios. In this paper we evaluate the performance of an existent state-of-the-art PAD scheme based on rPPG when analyzing short-time video sequences extracted from a longer video. Results are reported using the 3D Mask Attack Database (3DMAD), and a self-collected dataset called Heart Rate Database (HR), including different video durations, spectrum bands, resolutions and frame rates. Several conclusions can be drawn from this work: a) PAD performance based on rPPG varies significantly with the length of the analyzed video, b) rPPG information extracted from short-time sequences (over 5 seconds) can be discriminant enough for performing the PAD task, c) in general, videos using the NIR band perform better than those using the RGB band, and d) the temporal resolution is more valuable for rPPG signal extraction than the spatial resolution.","PeriodicalId":150600,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128375552","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 54

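The rPPG signal at the heart of this PAD scheme can be sketched in a few lines: average the green channel over the face region per frame, band-pass the trace to plausible heart-rate frequencies, and check how dominant the spectral peak is. The filter band, window length and `rppg_pulse_strength` helper are illustrative assumptions, not the paper's exact pipeline.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def rppg_pulse_strength(roi_means, fps):
    """Sketch of the rPPG idea behind pulse-based PAD: take the mean
    green-channel value of the face region per frame, band-pass it to
    plausible heart-rate frequencies (0.7-4 Hz, i.e. 42-240 bpm), and
    measure how dominant the spectral peak is."""
    signal = np.asarray(roi_means, dtype=float)
    signal = signal - signal.mean()
    b, a = butter(3, [0.7, 4.0], btype="band", fs=fps)
    filtered = filtfilt(b, a, signal)
    spectrum = np.abs(np.fft.rfft(filtered)) ** 2
    freqs = np.fft.rfftfreq(len(filtered), d=1.0 / fps)
    peak = spectrum.argmax()
    bpm = 60.0 * freqs[peak]
    dominance = spectrum[peak] / (spectrum.sum() + 1e-12)
    return bpm, dominance  # a live face should show a clear cardiac peak

# ~5 s of video at 30 fps, matching the paper's short-sequence finding:
t = np.arange(150) / 30.0
simulated_pulse = 0.5 * np.sin(2 * np.pi * 1.2 * t) + 0.1 * np.random.randn(150)
print(rppg_pulse_strength(simulated_pulse, fps=30))  # ~72 bpm, strong peak
```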
VisDA: A Synthetic-to-Real Benchmark for Visual Domain Adaptation
Xingchao Peng, Ben Usman, Neela Kaushik, Dequan Wang, Judy Hoffman, Kate Saenko
{"title":"VisDA: A Synthetic-to-Real Benchmark for Visual Domain Adaptation","authors":"Xingchao Peng, Ben Usman, Neela Kaushik, Dequan Wang, Judy Hoffman, Kate Saenko","doi":"10.1109/CVPRW.2018.00271","DOIUrl":"https://doi.org/10.1109/CVPRW.2018.00271","url":null,"abstract":"The success of machine learning methods on visual recognition tasks is highly dependent on access to large labeled datasets. However, real training images are expensive to collect and annotate for both computer vision and robotic applications. The synthetic images are easy to generate but model performance often drops significantly on data from a new deployment domain, a problem known as dataset shift, or dataset bias. Changes in the visual domain can include lighting, camera pose and background variation, as well as general changes in how the image data is collected. While this problem has been studied extensively in the domain adaptation literature, progress has been limited by the lack of large-scale challenge benchmarks.","PeriodicalId":150600,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"127 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128633267","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 130

Traffic Flow Analysis with Multiple Adaptive Vehicle Detectors and Velocity Estimation with Landmark-Based Scanlines
M. Tran, Tung Dinh Duy, Thanh-Dat Truong, V. Ton-That, Thanh-Nhon Do, Quoc-An Luong, Thanh-An Nguyen, Vinh-Tiep Nguyen, M. Do
{"title":"Traffic Flow Analysis with Multiple Adaptive Vehicle Detectors and Velocity Estimation with Landmark-Based Scanlines","authors":"M. Tran, Tung Dinh Duy, Thanh-Dat Truong, V. Ton-That, Thanh-Nhon Do, Quoc-An Luong, Thanh-An Nguyen, Vinh-Tiep Nguyen, M. Do","doi":"10.1109/CVPRW.2018.00021","DOIUrl":"https://doi.org/10.1109/CVPRW.2018.00021","url":null,"abstract":"In this paper, we propose our method for vehicle detection with multiple adaptive vehicle detectors and velocity estimation with landmark-based scanlines. Inspired by the idea for tiny object detection, we use Faster R-CNN with Resnet-101 to create different specialized vehicle detectors corresponding to different levels of details and poses. We propose a heuristic to check the fitness of a particular vehicle detector to a specific region in camera's view by the mean velocity direction and the mean object size. By this way, we can determine an adaptive set of appropriate vehicle detectors for each region in camera's view. Thus our system is expected to detect vehicles with high accuracy, both in precision and recall, even with tiny objects. We exploit the U.S. road rules for the length and distance of broken white lines on roads to propose our method for vehicle's velocity estimation using such landmarks. We determine equally-distributed scanlines, virtual parallel lines that are nearly-perpendicular to the road direction, with reference to the line connecting the corresponding ends of multiple broken white lines. From the timespan for a vehicle to cross two consecutive virtual scanlines, we can calculate the average vehicle's velocity within that road segment. We also refine the speed estimation by detecting when a vehicle stops at a traffic light, and smooth the results with a moving average filter. Experiments on the dataset of Traffic Flow Analysis from NVIDIA AI City Challenge 2018 show that our method achieves the perfect detect rate of 100%, the average velocity difference of 6.9762 mph on freeways, and 8.9144 mph on both freeways and urban roads.","PeriodicalId":150600,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"84 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121065769","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 17

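The scanline-based speed computation reduces to distance over time once the scanline spacing is tied to the lane markings. The sketch below assumes a 40 ft marking cycle (a 10 ft painted stripe plus a 30 ft gap, a common U.S. convention of the kind the paper exploits); the frame numbers and the `speed_mph` helper are hypothetical.

```python
# Sketch of the landmark-based scanline idea: if virtual scanlines are
# anchored to broken lane lines with known spacing, a vehicle's average
# speed is just spacing / crossing time.

FT_PER_CYCLE = 40.0            # one stripe + one gap (assumed convention)
MPH_PER_FT_PER_S = 3600.0 / 5280.0  # feet/second -> miles/hour

def speed_mph(frame_a, frame_b, video_fps, scanline_spacing_ft=FT_PER_CYCLE):
    """Average speed between two consecutive scanline crossings."""
    dt = (frame_b - frame_a) / video_fps   # seconds between crossings
    return (scanline_spacing_ft / dt) * MPH_PER_FT_PER_S

# A vehicle crossing consecutive scanlines 27 frames apart at 30 fps:
print(round(speed_mph(100, 127, video_fps=30), 1))  # ~30.3 mph
```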
Estimating Attention of Faces Due to its Growing Level of Emotions
R. Kumar, Jogendra Garain, D. Kisku, G. Sanyal
{"title":"Estimating Attention of Faces Due to its Growing Level of Emotions","authors":"R. Kumar, Jogendra Garain, D. Kisku, G. Sanyal","doi":"10.1109/CVPRW.2018.00261","DOIUrl":"https://doi.org/10.1109/CVPRW.2018.00261","url":null,"abstract":"In the task of attending faces in the disciplined assembly (Like in examination hall or Silent public places), our gaze automatically goes towards those persons who exhibits their expression other than the normal expression. It happens due to finding of dissimilar expression among the gathering of normal. In order to modeling this concept in the intelligent vision of computer system, hardly some effective researches have been succeeded. Therefore, in this proposal we have tried to come out with a solution for handling such challenging task of computer vision. Actually, this problem is related to cognitive aspect of visual attention. In the literature of visual saliency authors have dealt with expressionless objects but it has not been addressed with object like face which exploits expressions. Visual saliency is a term which differentiates \"appealing\" visual substance from others, based on their feature differences. In this paper, in the set of multiple faces, 'Salient face' has been explored based on 'emotion deviation' from the normal. In the first phase of the experiment, face detection task has been accomplished using Viola Jones face detector. The concept of deep convolution neural network (CNN) has been applied for training and classification of different facial expression of emotions. Moreover, saliency score of every face of the input image have been computed by measuring their 'emotion score' which depends upon the deviation from the 'normal expression' scores. This proposed approach exhibits fairly good result which may give a new dimension to the researchers towards the modeling of an intelligent vision system which can be useful in the task of visual security and surveillance.","PeriodicalId":150600,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121171568","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3

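A minimal sketch of the saliency scoring described above: given each face's softmax distribution over expression classes, score its deviation from the neutral class. The specific scoring rule below is an assumption for illustration, not the paper's exact emotion-score formula.

```python
import numpy as np

def emotion_saliency(expression_probs, neutral_idx=0):
    """Score each face by the probability mass it assigns away from the
    neutral class, i.e. its deviation from the 'normal expression'."""
    probs = np.asarray(expression_probs, dtype=float)
    return 1.0 - probs[:, neutral_idx]

# Three faces over classes [neutral, happy, surprised, angry]:
faces = np.array([
    [0.90, 0.05, 0.03, 0.02],   # calm face
    [0.85, 0.10, 0.03, 0.02],   # calm face
    [0.15, 0.05, 0.70, 0.10],   # surprised face -> most salient
])
scores = emotion_saliency(faces)
print(scores.argmax(), scores)  # index 2 stands out from the assembly
```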
Rotated Rectangles for Symbolized Building Footprint Extraction
Matt Dickenson, L. Gueguen
{"title":"Rotated Rectangles for Symbolized Building Footprint Extraction","authors":"Matt Dickenson, L. Gueguen","doi":"10.1109/CVPRW.2018.00039","DOIUrl":"https://doi.org/10.1109/CVPRW.2018.00039","url":null,"abstract":"Building footprints (BFP) provide useful visual context for users of digital maps when navigating in space. This paper proposes a method for extracting and symbolizing building footprints from satellite imagery using a convolutional neural network (CNN). The CNN architecture outputs rotated rectangles, providing a symbolized approximation that works well for small buildings. Experiments are conducted on the four cities in the DeepGlobe Challenge dataset (Las Vegas, Paris, Shanghai, Khartoum). Our method performs best on suburbs consisting of individual houses. These experiments show that either large buildings or buildings without clear delineation produce weaker results in terms of precision and recall.","PeriodicalId":150600,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114727718","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 12

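The rotated-rectangle output can be made concrete with a small helper that converts a (center, width, height, angle) parameterization into the four footprint corners. This parameterization is a common convention assumed here; the paper's exact output head may differ.

```python
import numpy as np

def rotated_rect_corners(cx, cy, w, h, theta):
    """(center, width, height, angle in radians) -> 4 corner points
    of the symbolized building footprint, counter-clockwise."""
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s], [s, c]])
    half = np.array([[-w, -h], [w, -h], [w, h], [-w, h]]) / 2.0
    return half @ rot.T + np.array([cx, cy])

# A 20x10 building rotated 30 degrees about (50, 50):
print(rotated_rect_corners(50, 50, 20, 10, np.deg2rad(30)))
```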
Crowd Activity Change Point Detection in Videos via Graph Stream Mining
Meng Yang, Lida Rashidi, S. Rajasegarar, C. Leckie, A. S. Rao, M. Palaniswami
{"title":"Crowd Activity Change Point Detection in Videos via Graph Stream Mining","authors":"Meng Yang, Lida Rashidi, S. Rajasegarar, C. Leckie, A. S. Rao, M. Palaniswami","doi":"10.1109/CVPRW.2018.00059","DOIUrl":"https://doi.org/10.1109/CVPRW.2018.00059","url":null,"abstract":"In recent years, there has been a growing interest in detecting anomalous behavioral patterns in video. In this work, we address this task by proposing a novel activity change point detection method to identify crowd movement anomalies for video surveillance. In our proposed novel framework, a hyperspherical clustering algorithm is utilized for the automatic identification of interesting regions, then the density of pedestrian flows between every pair of interesting regions over consecutive time intervals is monitored and represented as a sequence of adjacency matrices where the direction and density of flows are captured through a directed graph. Finally, we use graph edit distance as well as a cumulative sum test to detect change points in the graph sequence. We conduct experiments on four real-world video datasets: Dublin, New Orleans, Abbey Road and MCG Datasets. We observe that our proposed approach achieves a high F-measure, i.e., in the range [0.7, 1], for these datasets. The evaluation reveals that our proposed method can successfully detect the change points in all datasets at both global and local levels. Our results also demonstrate the efficiency and effectiveness of our proposed algorithm for change point detection and segmentation tasks.","PeriodicalId":150600,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"138 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125845824","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4

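A compact sketch of the detection stage described above: compute a distance between consecutive flow graphs (summed absolute edge-weight change is used below as a simple stand-in for graph edit distance) and run a one-sided CUSUM test over the distance sequence. The threshold and drift values are illustrative, not the paper's tuned settings.

```python
import numpy as np

def flow_graph_distances(adjacency_seq):
    """Distance between consecutive flow graphs; summed absolute
    edge-weight change stands in for graph edit distance."""
    return np.array([np.abs(b - a).sum()
                     for a, b in zip(adjacency_seq, adjacency_seq[1:])])

def cusum_change_points(values, threshold, drift=0.0):
    """One-sided cumulative-sum test: flag a change point whenever the
    accumulated positive deviation from the mean exceeds a threshold."""
    s, changes = 0.0, []
    mean = values.mean()
    for i, v in enumerate(values):
        s = max(0.0, s + (v - mean - drift))
        if s > threshold:
            changes.append(i)
            s = 0.0
    return changes

# 10 flow graphs over 3 regions, with a sudden flow surge after step 5:
rng = np.random.default_rng(0)
graphs = [rng.poisson(2, (3, 3)).astype(float) for _ in range(6)]
graphs += [rng.poisson(9, (3, 3)).astype(float) for _ in range(4)]
d = flow_graph_distances(graphs)
print(cusum_change_points(d, threshold=40.0))  # flags the surge
```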
AIC2018 Report: Traffic Surveillance Research
Tingyu Mao, Wei Zhang, Haoyu He, Yanjun Lin, Vinay Kale, Alexander Stein, Z. Kostić
{"title":"AIC2018 Report: Traffic Surveillance Research","authors":"Tingyu Mao, Wei Zhang, Haoyu He, Yanjun Lin, Vinay Kale, Alexander Stein, Z. Kostić","doi":"10.1109/CVPRW.2018.00019","DOIUrl":"https://doi.org/10.1109/CVPRW.2018.00019","url":null,"abstract":"Traffic surveillance and management technologies are some of the most intriguing aspects of smart city applications. In this paper, we investigate and present the methods for vehicle detections, tracking, speed estimation and anomaly detection for NVIDIA AI City Challenge 2018 (AIC2018). We applied Mask-RCNN and deep-sort for vehicle detection and tracking in track 1, and optical flow based method in track 2. In track 1, we achieve 100% detection rate and 7.97 mile/hour estimation error for speed estimation.","PeriodicalId":150600,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"130 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121964958","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 12

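Deep SORT combines a Kalman motion model with appearance embeddings; the sketch below strips the association step down to greedy IoU matching between existing tracks and new detections, to illustrate the tracking-by-detection idea rather than the authors' full tracker.

```python
import numpy as np

def iou(box_a, box_b):
    """IoU of two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-12)

def associate(tracks, detections, min_iou=0.3):
    """Greedy IoU matching of existing tracks to new detections."""
    matches, used = [], set()
    for t_idx, t_box in enumerate(tracks):
        scores = [(iou(t_box, d), d_idx)
                  for d_idx, d in enumerate(detections) if d_idx not in used]
        if scores:
            best, d_idx = max(scores)
            if best >= min_iou:
                matches.append((t_idx, d_idx))
                used.add(d_idx)
    return matches

tracks = [[0, 0, 10, 10], [50, 50, 70, 70]]
detections = [[52, 51, 71, 72], [1, 0, 11, 10]]
print(associate(tracks, detections))  # [(0, 1), (1, 0)]
```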
Improving Viseme Recognition Using GAN-Based Frontal View Mapping
Dario Augusto Borges Oliveira, Andréa Britto Mattos, E. Morais
{"title":"Improving Viseme Recognition Using GAN-Based Frontal View Mapping","authors":"Dario Augusto Borges Oliveira, Andréa Britto Mattos, E. Morais","doi":"10.1109/CVPRW.2018.00289","DOIUrl":"https://doi.org/10.1109/CVPRW.2018.00289","url":null,"abstract":"Deep learning methods have become the standard for Visual Speech Recognition problems due to their high accuracy results reported in the literature. However, while successful works have been reported for words and sentences, recognizing shorter segments of speech, like phones, has proven to be much more challenging due to the lack of temporal and contextual information. Also, head-pose variation remains a known issue for facial analysis with direct impact in this problem. In this context, we propose a novel methodology to tackle the problem of recognizing visemes – the visual equivalent of phonemes – using a GAN to artificially lock the face view into a perfect frontal view, reducing the view angle variability and simplifying the recognition task performed by our classification CNN. The GAN is trained using a large-scale synthetic 2D dataset based on realistic 3D facial models, automatically labelled for different visemes, mapping a slightly random view to a perfect frontal view. We evaluate our method using the GRID corpus, which was processed to extract viseme images and their corresponding synthetic frontal views to be further classified by our CNN model. Our results demonstrate that the additional synthetic frontal view is able to improve accuracy in 5.9% when compared with classification using the original image only.","PeriodicalId":150600,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115944890","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6

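The two-stage pipeline, frontalize then classify, can be sketched as follows. Both networks, the 64x64 crop size, and the 14-viseme count are illustrative placeholders standing in for the paper's trained GAN generator and CNN classifier.

```python
import torch
import torch.nn as nn

class Frontalizer(nn.Module):
    """Stand-in for the trained GAN generator (image -> frontal image)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(16, 3, 3, padding=1), nn.Tanh(),
        )

    def forward(self, x):
        return self.net(x)

class VisemeClassifier(nn.Module):
    """Stand-in for the classification CNN operating on frontal views."""
    def __init__(self, num_visemes=14):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(16, num_visemes)

    def forward(self, x):
        return self.fc(self.features(x).flatten(1))

crop = torch.rand(1, 3, 64, 64)        # off-angle viseme crop
frontal = Frontalizer()(crop)          # locked frontal view
logits = VisemeClassifier()(frontal)   # viseme prediction
print(logits.shape)                    # torch.Size([1, 14])
```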