Latest Publications: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)

Increasing Video Saliency Model Generalizability by Training for Smooth Pursuit Prediction
Mikhail Startsev, M. Dorr
{"title":"Increasing Video Saliency Model Generalizability by Training for Smooth Pursuit Prediction","authors":"Mikhail Startsev, M. Dorr","doi":"10.1109/CVPRW.2018.00264","DOIUrl":"https://doi.org/10.1109/CVPRW.2018.00264","url":null,"abstract":"Saliency prediction even for videos is traditionally associated with fixation prediction. Unlike images, however, videos also induce smooth pursuit eye movements, for example when a salient object is moving and is tracked across the video surface. Nevertheless, current saliency data sets and models mostly ignore pursuit, either by combining it with fixations, or discarding the respective samples. In this work, we utilize a state-of-the-art smooth pursuit detector and a Slicing Convolutional Neural Network (S-CNN) to train two saliency models, one targeting fixation prediction and the other targeting smooth pursuit. We hypothesize that pursuit-salient video parts would generalize better, since the motion patterns should be relatively similar across data sets. To test this, we consider an independent video saliency data set, where no pursuit-fixation differentiation is performed. In our experiments, the pursuit-targeting model outperforms several state-of-the-art saliency algorithms on both the test part of our main data set and the additionally considered data set.","PeriodicalId":150600,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130273864","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
Covariance Matrices Encoding Based on the Log-Euclidean and Affine Invariant Riemannian Metrics
Ioana Ilea, L. Bombrun, S. Said, Y. Berthoumieu
{"title":"Covariance Matrices Encoding Based on the Log-Euclidean and Affine Invariant Riemannian Metrics","authors":"Ioana Ilea, L. Bombrun, S. Said, Y. Berthoumieu","doi":"10.1109/CVPRW.2018.00080","DOIUrl":"https://doi.org/10.1109/CVPRW.2018.00080","url":null,"abstract":"This paper presents coding methods used to encode a set of covariance matrices. Starting from a Gaussian mixture model adapted to the log-Euclidean or affine invariant Riemannian metric, we propose a Fisher Vector (FV) descriptor adapted to each of these metrics: the log Euclidean FV (LE FV) and the Riemannian Fisher Vector (RFV). An experiment is conducted on four conventional texture databases to compare these two metrics and to illustrate the potential of these FV based descriptors compared to state-of-the-art BoW and VLAD based descriptors. A focus is also done to illustrate the advantage of using the Fisher information matrix during the derivation of the FV.","PeriodicalId":150600,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"160 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132929608","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
RPIfield: A New Dataset for Temporally Evaluating Person Re-identification
Meng Zheng, S. Karanam, R. Radke
{"title":"RPIfield: A New Dataset for Temporally Evaluating Person Re-identification","authors":"Meng Zheng, S. Karanam, R. Radke","doi":"10.1109/CVPRW.2018.00251","DOIUrl":"https://doi.org/10.1109/CVPRW.2018.00251","url":null,"abstract":"The operational aspects of real-world human re-identification are typically oversimplified in academic research. Specifically, re-id algorithms are evaluated by matching probe images to candidates from a fixed gallery collected at the end of a video, ignoring the arrival time of each candidate. However, in real-world applications like crime prevention, a re-id system would likely operate in real time, and might be in continuous operation for several days. It would be natural to provide the user of such a system with instantaneous ranked lists from the current gallery candidates rather than waiting for a collective list after processing the whole video sequence. Re-id algorithms thus need to be evaluated based on their temporal performance on a dynamic gallery populated by an increasing number of candidates (some of whom may return several times over a long duration). This aspect of the problem is difficult to study with current benchmarking re-id datasets since they lack time-stamp information. In this paper, we introduce a new multi-shot re-id dataset, called RPIfield, which provides explicit time-stamp information for each candidate. The RPIfield dataset is comprised of 12 outdoor camera videos, with 112 known actors walking along pre-specified paths among about 4000 distractors. Each actor in RPIfield has multiple reappearances in one or more camera views, which allows the study of re-id algorithms in a more general context, especially with respect to temporal aspects.","PeriodicalId":150600,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"157 10","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133686538","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 21
Implementing a Robust Explanatory Bias in a Person Re-identification Network
Esube Bekele, W. Lawson, Z. Horne, S. Khemlani
{"title":"Implementing a Robust Explanatory Bias in a Person Re-identification Network","authors":"Esube Bekele, W. Lawson, Z. Horne, S. Khemlani","doi":"10.1109/CVPRW.2018.00291","DOIUrl":"https://doi.org/10.1109/CVPRW.2018.00291","url":null,"abstract":"Deep learning improved attributes recognition significantly in recent years. However, many of these networks remain \"black boxes\" and providing a meaningful explanation of their decisions is a major challenge. When these networks misidentify a person, they should be able to explain this mistake. The ability to generate explanations compelling enough to serve as useful accounts of the system's operations at a very high human-level is still in its infancy. In this paper, we utilize person re-identification (re-ID) networks as a platform to generate explanations. We propose and implement a framework that can be used to explain person re-ID using soft-biometric attributes. In particular, the resulting framework embodies a cognitively validated explanatory bias: people prefer and produce explanations that concern inherent properties instead of extrinsic influences. This bias is pervasive in that it affects the fitness of explanations across a broad swath of contexts, particularly those that concern conflicting or anomalous observations. To explain person re-ID, we developed a multiattribute residual network that treats a subset of its features as either inherent or extrinsic. Using these attributes, the system generates explanations based on inherent properties when the similarity of two input images is low, and it generates explanations based on extrinsic properties when the similarity is high. We argue that such a framework provides a blueprint for how to make the decisions of deep networks comprehensible to human operators. As an intermediate step, we demonstrate state-of-the-art attribute recognition performance on two pedestrian datasets (PETA and PA100K) and a face-based attribute dataset (CelebA). The VIPeR dataset is then used to generate explanations for re-ID with a network trained on PETA attributes.","PeriodicalId":150600,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"177 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124403935","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
Relating Deep Neural Network Representations to EEG-fMRI Spatiotemporal Dynamics in a Perceptual Decision-Making Task
Tao Tu, Jonathan Koss, P. Sajda
{"title":"Relating Deep Neural Network Representations to EEG-fMRI Spatiotemporal Dynamics in a Perceptual Decision-Making Task","authors":"Tao Tu, Jonathan Koss, P. Sajda","doi":"10.1109/CVPRW.2018.00267","DOIUrl":"https://doi.org/10.1109/CVPRW.2018.00267","url":null,"abstract":"The hierarchical architecture of deep convolutional neural networks (CNN) resembles the multi-level processing stages of the human visual system during object recognition. Converging evidence suggests that this hierarchical organization is key to the CNN achieving human-level performance in object categorization [22]. In this paper, we leverage the hierarchical organization of the CNN to investigate the spatiotemporal dynamics of rapid visual processing in the human brain. Specifically we focus on perceptual decisions associated with different levels of visual ambiguity. Using simultaneous EEG-fMRI, we demonstrate the temporal and spatial hierarchical correspondences between the multi-stage processing in CNN and the activity observed in the EEG and fMRI. The hierarchical correspondence suggests a processing pathway during rapid visual decision-making that involves the interplay between sensory regions, the default mode network (DMN) and the frontal-parietal control network (FPCN).","PeriodicalId":150600,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"205 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114649004","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
Estimation of Center of Mass for Sports Scene Using Weighted Visual Hull
Tomoya Kaichi, Shohei Mori, H. Saito, Kosuke Takahashi, Dan Mikami, Mariko Isogawa, H. Kimata
{"title":"Estimation of Center of Mass for Sports Scene Using Weighted Visual Hull","authors":"Tomoya Kaichi, Shohei Mori, H. Saito, Kosuke Takahashi, Dan Mikami, Mariko Isogawa, H. Kimata","doi":"10.1109/CVPRW.2018.00234","DOIUrl":"https://doi.org/10.1109/CVPRW.2018.00234","url":null,"abstract":"This paper presents a method to estimate the 3D position of a center of mass (CoM) of a human body from a set of multi-view images. As a well-known fact, in sports, collections of CoM are important for analyzing the athletes' performance. Most conventional studies in CoM estimation require installing a measuring system (e.g., a force plate or optical motion capture system) or attaching sensors to the athlete. While such systems reliably estimate CoM, casual settings are preferable for simplifying preparations. To address this issue, the proposed method takes a vision-based approach that does not require specialized hardware and wearable devices. Our method calculates subject's CoM using voxels with body parts dependent weighting. This individual voxel reconstruction and voxel-wise weighting reflects the differences in each body shape, and are expected to contribute to higher performance in analysis. The results using real data demonstrated the performance of the proposed method were compared to force plate data, and provided a 3D CoM visualization in a dynamic scene.","PeriodicalId":150600,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116925826","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
Traffic Speed Estimation from Surveillance Video Data: For the 2nd NVIDIA AI City Challenge Track 1
Tingting Huang
{"title":"Traffic Speed Estimation from Surveillance Video Data: For the 2nd NVIDIA AI City Challenge Track 1","authors":"Tingting Huang","doi":"10.1109/CVPRW.2018.00029","DOIUrl":"https://doi.org/10.1109/CVPRW.2018.00029","url":null,"abstract":"Estimating traffic flow condition is a tough but beneficial task. In Intelligent Transportation System (ITS), many applications have been done to collect and analyze traffic data. However, the surveillance video data are still only used for engineer's manual check. To better utilize this data source, traffic flow estimation from surveillance camera should be explored. This study uses Faster Regional Convolutional Neural Network (Faster R-CNN) with ResNet 101 as the backbone to achieve multi-object detection. Then a tracking algorithm based on histogram comparison is applied to link objects across frames. Finally, this study uses warping method to convert vehicle speeds from the pixel domain to the real world. The results show that estimating vehicle speed at intersection is more challenging than in uninterrupted flow.","PeriodicalId":150600,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125757477","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
Monocular RGB Hand Pose Inference from Unsupervised Refinable Nets
E. Dibra, Silvan Melchior, A. Balkis, Thomas Wolf, C. Öztireli, M. Gross
{"title":"Monocular RGB Hand Pose Inference from Unsupervised Refinable Nets","authors":"E. Dibra, Silvan Melchior, A. Balkis, Thomas Wolf, C. Öztireli, M. Gross","doi":"10.1109/CVPRW.2018.00155","DOIUrl":"https://doi.org/10.1109/CVPRW.2018.00155","url":null,"abstract":"3D hand pose inference from monocular RGB data is a challenging problem. CNN-based approaches have shown great promise in tackling this problem. However, such approaches are data-hungry, and obtaining real labeled training hand data is very hard. To overcome this, in this work, we propose a new, large, realistically rendered hand dataset and a neural network trained on it, with the ability to refine itself unsupervised on real unlabeled RGB images, given corresponding depth images. We benchmark and validate our method on existing and captured datasets, demonstrating that we strongly compare to or outperform state-of-the-art methods for various tasks ranging from 3D pose estimation to hand gesture recognition.","PeriodicalId":150600,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129774940","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 17
Latent Fingerprint Image Quality Assessment Using Deep Learning
J. Ezeobiejesi, B. Bhanu
{"title":"Latent Fingerprint Image Quality Assessment Using Deep Learning","authors":"J. Ezeobiejesi, B. Bhanu","doi":"10.1109/CVPRW.2018.00092","DOIUrl":"https://doi.org/10.1109/CVPRW.2018.00092","url":null,"abstract":"Latent fingerprints are fingerprint impressions unintentionally left on surfaces at a crime scene. They are crucial in crime scene investigations for making identifications or exclusions of suspects. Determining the quality of latent fingerprint images is crucial to the effectiveness and reliability of matching algorithms. To alleviate the inconsistency and subjectivity inherent in feature markups by latent fingerprint examiners, automatic processing of latent fingerprints is imperative. We propose a deep neural network that predicts the quality of image patches extracted from a latent fingerprint and knits them together to predict the quality of a given latent fingerprint. The proposed approach eliminates the need for manual ROI markup and manual feature markup by latent examiners. Experimental results on NIST SD27 show the effectiveness of our technique in latent fingerprint quality prediction.","PeriodicalId":150600,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128614613","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 14
Exploring the Feasibility of Face Video Based Instantaneous Heart-Rate for Micro-Expression Spotting
Puneet Gupta, B. Bhowmick, A. Pal
{"title":"Exploring the Feasibility of Face Video Based Instantaneous Heart-Rate for Micro-Expression Spotting","authors":"Puneet Gupta, B. Bhowmick, A. Pal","doi":"10.1109/CVPRW.2018.00179","DOIUrl":"https://doi.org/10.1109/CVPRW.2018.00179","url":null,"abstract":"Facial micro-expressions (ME) are manifested by human reflexive behavior and thus they are useful to disclose the genuine human emotions. Their analysis plays a pivotal role in many real-world applications encompassing affective computing, biometrics and psychotherapy. The first and foremost step for ME analysis is ME spotting which refers to detection of ME affected frames from a video. ME spotting is a highly challenging research problem and even human experts cannot correctly perform it because MEs are manifested using subtle face deformations and that too for a short duration. It is well established that changes in the human emotions, not only manifest ME but it also introduces changes in instantaneous heart rate. Thus, the manifestation of ME and changes in the instantaneous heart rate are related to the change in human expressions and both of them are estimated using temporal deformations of the face. This provides the motivation of this paper that aims to explore the feasibility of variations in the instantaneous heart rate for performing the correct the ME spotting. Experimental results conducted on a publicly available spontaneous ME spotting dataset, reveal that the variations in instantaneous heart rate can be utilized to improve the ME spotting.","PeriodicalId":150600,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129342347","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 14