{"title":"Unconstrained Face Alignment Without Face Detection","authors":"Xiaohu Shao, Junliang Xing, Jiang-Jing Lv, C. Xiao, Pengcheng Liu, Youji Feng, Cheng Cheng","doi":"10.1109/CVPRW.2017.258","DOIUrl":"https://doi.org/10.1109/CVPRW.2017.258","url":null,"abstract":"This paper introduces our submission to the 2nd Facial Landmark Localisation Competition. We present a deep architecture to directly detect facial landmarks without using face detection as an initialization. The architecture consists of two stages, a Basic Landmark Prediction Stage and a Whole Landmark Regression Stage. At the former stage, given an input image, the basic landmarks of all faces are detected by a sub-network of landmark heatmap and affinity field prediction. At the latter stage, the coarse canonical face and the pose can be generated by a Pose Splitting Layer based on the visible basic landmarks. According to its pose, each canonical state is distributed to the corresponding branch of the shape regression sub-networks for the whole landmark detection. Experimental results show that our method obtains promising results on the 300-W dataset, and achieves superior performances over the baselines of the semi-frontal and the profile categories in this competition.","PeriodicalId":6668,"journal":{"name":"2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"1 1","pages":"2069-2077"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83230703","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring the Granularity of Sparsity in Convolutional Neural Networks","authors":"Huizi Mao, Song Han, Jeff Pool, Wenshuo Li, Xingyu Liu, Yu Wang, W. Dally","doi":"10.1109/CVPRW.2017.241","DOIUrl":"https://doi.org/10.1109/CVPRW.2017.241","url":null,"abstract":"Sparsity helps reduce the computational complexity of DNNs by skipping multiplications with zeros. The granularity of sparsity affects both the efficiency of the hardware architecture and the prediction accuracy. In this paper we quantitatively measure the accuracy-sparsity relationship at different granularities. Coarse-grained sparsity yields a more regular sparsity pattern, making hardware acceleration easier, and our experimental results show that coarse-grained sparsity has a very small impact on the sparsity ratio given no loss of accuracy. Moreover, due to the index saving effect, coarse-grained sparsity is able to obtain similar or even better compression rates than fine-grained sparsity at the same accuracy threshold. Our analysis, which is based on the framework of a recent sparse convolutional neural network (SCNN) accelerator, further demonstrates that it saves 30% – 35% of memory references compared with fine-grained sparsity.","PeriodicalId":6668,"journal":{"name":"2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"66 1","pages":"1927-1934"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75816539","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fast Simplex-HMM for One-Shot Learning Activity Recognition","authors":"Mario Rodríguez, C. Orrite-Uruñuela, C. Medrano, D. Makris","doi":"10.1109/CVPRW.2017.166","DOIUrl":"https://doi.org/10.1109/CVPRW.2017.166","url":null,"abstract":"The work presented in this paper deals with the challenging task of learning an activity class representation using a single sequence for training. Recently, the Simplex-HMM framework has been shown to be an efficient representation for activity classes; however, it incurs high computational costs, making it impractical in several situations. This paper presents a dimensionality reduction of the feature space based on a Maximum a Posteriori adaptation, combined with a fast estimation of the optimal parameters in the Expectation Maximization algorithm. As confirmed by the experimental results, these two modifications not only reduce the computational cost but also maintain the performance or even improve it. The suitability of the process is experimentally confirmed using the human activity datasets Weizmann, KTH and IXMAS and the gesture dataset ChaLearn.","PeriodicalId":6668,"journal":{"name":"2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"35 1","pages":"1259-1266"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81079084","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Locally Adaptive Color Correction for Underwater Image Dehazing and Matching","authors":"C. Ancuti, Cosmin Ancuti, C. Vleeschouwer, Rafael García","doi":"10.1109/CVPRW.2017.136","DOIUrl":"https://doi.org/10.1109/CVPRW.2017.136","url":null,"abstract":"Underwater images are known to be strongly deteriorated by a combination of wavelength-dependent light attenuation and scattering. This results in complex color casts that depend both on the scene depth map and on the light spectrum. Color transfer, which is a technique of choice to counterbalance color casts, assumes stationary casts, defined by global parameters, and is therefore not directly applicable to the locally variable color casts encountered in underwater scenarios. To fill this gap, this paper introduces an original fusion-based strategy to exploit color transfer while tuning the color correction locally, as a function of the light attenuation level estimated from the red channel. The Dark Channel Prior (DCP) is then used to restore the color compensated image, by inverting the simplified Koschmieder light transmission model, as for outdoor dehazing. Our technique enhances image contrast in a quite effective manner and also supports accurate transmission map estimation. Our extensive experiments also show that our color correction strongly improves the effectiveness of local keypoints matching.","PeriodicalId":6668,"journal":{"name":"2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"26 3 1","pages":"997-1005"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78820364","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Earth Observation Using SAR and Social Media Images","authors":"Yuanyuan Wang, Xiaoxiang Zhu","doi":"10.1109/CVPRW.2017.202","DOIUrl":"https://doi.org/10.1109/CVPRW.2017.202","url":null,"abstract":"Earth Observation (EO) is mostly carried out through centralized optical and synthetic aperture radar (SAR) missions. Despite the controlled quality of their products, such observation is restricted by the characteristics of the sensor platform, e.g. the revisit time. Over the last decade, the rapid development of social media has accumulated a vast amount of online images. Despite their uncontrolled quality, the sheer volume may contain useful information that can complement the EO missions, especially the SAR missions. This paper presents a preliminary work on fusing social media and SAR images. They have distinct imaging geometries, which are nearly impossible to even coregister without a precise 3-D model. We describe a general approach to coregister them without using an external 3-D model. We demonstrate that one can obtain a new kind of 3-D city model that includes the optical texture for better scene understanding and the precise deformation retrieved from SAR interferometry.","PeriodicalId":6668,"journal":{"name":"2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"21 1","pages":"1580-1588"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90184514","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parsimonious Coding and Verification of Offline Handwritten Signatures","authors":"E. Zois, Ilias Theodorakopoulos, Dimitrios Tsourounis, G. Economou","doi":"10.1109/CVPRW.2017.92","DOIUrl":"https://doi.org/10.1109/CVPRW.2017.92","url":null,"abstract":"A common practice for addressing the problem of verifying the presence, or the consent of a person in many transactions is to utilize the handwritten signature. Among others, the offline or static signature is a valuable tool in forensic related studies. Thus, the importance of verifying static handwritten signatures still poses a challenging task. Throughout the literature, gray-level images, composed of handwritten signature traces are subjected to numerous processing stages; their outcome is the mapping of any input signature image in a so-called corresponding feature space. Pattern recognition techniques utilize this feature space, usually as a binary verification problem. In this work, sparse dictionary learning and coding are for the first time employed as a means to provide a feature space for offline signature verification, which intuitively adapts to a small set of randomly selected genuine reference samples, thus making it attractive for forensic cases. In this context, the K-SVD dictionary learning algorithm is employed in order to create a writer oriented lexicon. For any signature sample, sparse representation with the use of the writer's lexicon and the Orthogonal Matching Pursuit algorithm generates a weight matrix; features are then extracted by applying simple average pooling to the generated sparse codes. The performance of the proposed scheme is demonstrated using the popular CEDAR, MCYT75 and GPDS300 signature datasets, delivering state of the art results.","PeriodicalId":6668,"journal":{"name":"2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"36 1","pages":"636-645"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89923691","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Caught Red-Handed: Toward Practical Video-Based Subsequences Matching in the Presence of Real-World Transformations","authors":"Yi Xu, True Price, F. Monrose, Jan-Michael Frahm","doi":"10.1109/CVPRW.2017.182","DOIUrl":"https://doi.org/10.1109/CVPRW.2017.182","url":null,"abstract":"Every minute, staggering amounts of user-generated videos are uploaded to on-line social networks. These videos can generate significant advertising revenue, providing strong incentive for unscrupulous individuals that wish to capitalize on this bonanza by pirating short clips from popular content and altering the copied media in ways that might bypass detection. Unfortunately, while the challenges posed by the use of skillful transformations have been known for quite some time, current state-of-the-art methods still suffer from severe limitations. Indeed, most of today's techniques perform poorly in the face of real-world copies. To address this, we propose a novel approach that leverages temporal characteristics to identify subsequences of a video that were copied from elsewhere. Our approach takes advantage of a new temporal feature to index a reference library in a manner that is robust to popular spatial and temporal transformations in pirated videos. Our experimental evaluation on 27 hours of video obtained from social networks demonstrates that our technique significantly outperforms the existing state-of-the-art approaches with respect to accuracy, resilience, and efficiency.","PeriodicalId":6668,"journal":{"name":"2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"51 1","pages":"1397-1406"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89160497","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Human Activity Recognition Using Combinatorial Deep Belief Networks","authors":"Shreyank N. Gowda","doi":"10.1109/CVPRW.2017.203","DOIUrl":"https://doi.org/10.1109/CVPRW.2017.203","url":null,"abstract":"Human activity recognition is a topic undergoing a great amount of research. The main reason for that is the number of practical applications that are developed using activity recognition as the base. This paper proposes an approach to human activity recognition using a combination of deep belief networks. One network is used to obtain features from motion and to do this we propose a modified Weber descriptor. Another network is used to obtain features from images and to do this we propose the modification of the standard local binary patterns descriptor to obtain a concatenated histogram of lower dimensions. This helps to encode spatial and temporal information of various actions happening in a frame. This further helps to overcome the dimensionality problem that occurs with LBP. The features extracted are then passed onto a CNN that classifies the activity. A few standard activities are considered, such as walking, sprinting, and hugging. Results showed that the proposed algorithm gave a high level of accuracy for classification.","PeriodicalId":6668,"journal":{"name":"2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"1 1","pages":"1589-1594"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76460459","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Cost-Effective Framework for Automated Vehicle-Pedestrian Near-Miss Detection Through Onboard Monocular Vision","authors":"Ruimin Ke, J. Lutin, J. Spears, Yinhai Wang","doi":"10.1109/CVPRW.2017.124","DOIUrl":"https://doi.org/10.1109/CVPRW.2017.124","url":null,"abstract":"Onboard monocular cameras have been widely deployed in both public transit and personal vehicles. Obtaining vehicle-pedestrian near-miss event data from onboard monocular vision systems may be cost-effective compared with onboard multiple-sensor systems or traffic surveillance videos. But extracting near-misses from onboard monocular vision is challenging and little work has been published. This paper fills the gap by developing a framework to automatically detect vehicle-pedestrian near-misses through onboard monocular vision. The proposed framework can estimate depth and real-world motion information through monocular vision with a moving video background. The experimental results, based on processing over 30 hours of video data, demonstrate the ability of the system to capture near-misses by comparison with the events logged by the Rosco/MobilEye Shield+ system, which includes four cameras working cooperatively. The detection overlap rate reaches over 90% with the thresholds properly set.","PeriodicalId":6668,"journal":{"name":"2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"97 1","pages":"898-905"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79938720","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cartooning for Enhanced Privacy in Lifelogging and Streaming Videos","authors":"Eman T. Hassan, Rakibul Hasan, Patrick Shaffer, David J. Crandall, Apu Kapadia","doi":"10.1109/CVPRW.2017.175","DOIUrl":"https://doi.org/10.1109/CVPRW.2017.175","url":null,"abstract":"We describe an object replacement approach whereby privacy-sensitive objects in videos are replaced by abstract cartoons taken from clip art. Our approach uses a combination of computer vision, deep learning, and image processing techniques to detect objects, abstract details, and replace them with cartoon clip art. We conducted a user study (N=85) to discern the utility and effectiveness of our cartoon replacement technique. The results suggest that our object replacement approach preserves a video's semantic content while improving its privacy by obscuring details of objects.","PeriodicalId":6668,"journal":{"name":"2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"20 1","pages":"1333-1342"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91366679","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}