Kuan-Chuan Peng, Tsuhan Chen, Amir Sadovnik, Andrew C. Gallagher
{"title":"A mixed bag of emotions: Model, predict, and transfer emotion distributions","authors":"Kuan-Chuan Peng, Tsuhan Chen, Amir Sadovnik, Andrew C. Gallagher","doi":"10.1109/CVPR.2015.7298687","DOIUrl":"https://doi.org/10.1109/CVPR.2015.7298687","url":null,"abstract":"This paper explores two new aspects of photos and human emotions. First, we show through psychovisual studies that different people have different emotional reactions to the same image, which is a strong and novel departure from previous work that only records and predicts a single dominant emotion for each image. Our studies also show that the same person may have multiple emotional reactions to one image. Predicting emotions in “distributions” instead of a single dominant emotion is important for many applications. Second, we show not only that we can often change the evoked emotion of an image by adjusting color tone and texture related features but also that we can choose in which “emotional direction” this change occurs by selecting a target image. In addition, we present a new database, Emotion6, containing distributions of emotions.","PeriodicalId":444472,"journal":{"name":"2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114612528","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Projection Metric Learning on Grassmann Manifold with Application to Video based Face Recognition","authors":"Zhiwu Huang, Ruiping Wang, S. Shan, Xilin Chen","doi":"10.1109/CVPR.2015.7298609","DOIUrl":"https://doi.org/10.1109/CVPR.2015.7298609","url":null,"abstract":"In video based face recognition, great success has been made by representing videos as linear subspaces, which typically lie in a special type of non-Euclidean space known as Grassmann manifold. To leverage the kernel-based methods developed for Euclidean space, several recent methods have been proposed to embed the Grassmann manifold into a high dimensional Hilbert space by exploiting the well established Project Metric, which can approximate the Riemannian geometry of Grassmann manifold. Nevertheless, they inevitably introduce the drawbacks from traditional kernel-based methods such as implicit map and high computational cost to the Grassmann manifold. To overcome such limitations, we propose a novel method to learn the Projection Metric directly on Grassmann manifold rather than in Hilbert space. From the perspective of manifold learning, our method can be regarded as performing a geometry-aware dimensionality reduction from the original Grassmann manifold to a lower-dimensional, more discriminative Grassmann manifold where more favorable classification can be achieved. Experiments on several real-world video face datasets demonstrate that the proposed method yields competitive performance compared with the state-of-the-art algorithms.","PeriodicalId":444472,"journal":{"name":"2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117072637","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Face alignment using cascade Gaussian process regression trees","authors":"Donghoon Lee, Hyunsin Park, C. Yoo","doi":"10.1109/CVPR.2015.7299048","DOIUrl":"https://doi.org/10.1109/CVPR.2015.7299048","url":null,"abstract":"In this paper, we propose a face alignment method that uses cascade Gaussian process regression trees (cGPRT) constructed by combining Gaussian process regression trees (GPRT) in a cascade stage-wise manner. Here, GPRT is a Gaussian process with a kernel defined by a set of trees. The kernel measures the similarity between two inputs as the number of trees where the two inputs fall in the same leaves. Without increasing prediction time, the prediction of cGPRT can be performed in the same framework as the cascade regression trees (CRT) but with better generalization. Features for GPRT are designed using shape-indexed difference of Gaussian (DoG) filter responses sampled from local retinal patterns to increase stability and to attain robustness against geometric variances. Compared with the previous CRT-based face alignment methods that have shown state-of-the-art performances, cGPRT using shape-indexed DoG features performed best on the HELEN and 300-W datasets which are the most challenging dataset today.","PeriodicalId":444472,"journal":{"name":"2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116087550","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xianming Liu, R. Ji, Changhu Wang, W. Liu, Bineng Zhong, Thomas S. Huang
{"title":"Understanding image structure via hierarchical shape parsing","authors":"Xianming Liu, R. Ji, Changhu Wang, W. Liu, Bineng Zhong, Thomas S. Huang","doi":"10.1109/CVPR.2015.7299139","DOIUrl":"https://doi.org/10.1109/CVPR.2015.7299139","url":null,"abstract":"Exploring image structure is a long-standing yet important research subject in the computer vision community. In this paper, we focus on understanding image structure inspired by the “simple-to-complex” biological evidence. A hierarchical shape parsing strategy is proposed to partition and organize image components into a hierarchical structure in the scale space. To improve the robustness and flexibility of image representation, we further bundle the image appearances into hierarchical parsing trees. Image descriptions are subsequently constructed by performing a structural pooling, facilitating efficient matching between the parsing trees. We leverage the proposed hierarchical shape parsing to study two exemplar applications including edge scale refinement and unsupervised “objectness” detection. We show competitive parsing performance comparing to the state-of-the-arts in above scenarios with far less proposals, which thus demonstrates the advantage of the proposed parsing scheme.","PeriodicalId":444472,"journal":{"name":"2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115373822","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GRSA: Generalized range swap algorithm for the efficient optimization of MRFs","authors":"Kangwei Liu, Junge Zhang, Peipei Yang, Kaiqi Huang","doi":"10.1109/CVPR.2015.7298785","DOIUrl":"https://doi.org/10.1109/CVPR.2015.7298785","url":null,"abstract":"Markov Random Field (MRF) is an important tool and has been widely used in many vision tasks. Thus, the optimization of MRFs is a problem of fundamental importance. Recently, Veskler and Kumar et. al propose the range move algorithms, which are one of the most successful solvers to this problem. However, two problems have limited the applicability of previous range move algorithms: 1) They are limited in the types of energies they can handle (i.e. only truncated convex functions); 2) These algorithms tend to be very slow compared to other graph-cut based algorithms (e.g. α-expansion and αβ-swap). In this paper, we propose a generalized range swap algorithm (GRSA) for efficient optimization of MRFs. To address the first problem, we extend the GRSA to arbitrary semimetric energies by restricting the chosen labels in each move so that the energy is submodular on the chosen subset. Furthermore, to feasibly choose the labels satisfying the submodular condition, we provide a sufficient condition of the submodularity. For the second problem, unlike previous range move algorithms which execute the set of all possible range moves, we dynamically obtain the iterative moves by solving a set cover problem, which greatly reduces the number of moves during the optimization. Experiments show that the GRSA offers a great speedup over previous range swap algorithms, while it obtains competitive solutions.","PeriodicalId":444472,"journal":{"name":"2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"47 4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115429578","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Graph-based simplex method for pairwise energy minimization with binary variables","authors":"D. Prusa","doi":"10.1109/CVPR.2015.7298645","DOIUrl":"https://doi.org/10.1109/CVPR.2015.7298645","url":null,"abstract":"We show how the simplex algorithm can be tailored to the linear programming relaxation of pairwise energy minimization with binary variables. A special structure formed by basic and nonbasic variables in each stage of the algorithm is identified and utilized to perform the whole iterative process combinatorially over the input energy minimization graph rather than algebraically over the simplex tableau. This leads to a new efficient solver. We demonstrate that for some computer vision instances it performs even better than methods reducing binary energy minimization to finding maximum flow in a network.","PeriodicalId":444472,"journal":{"name":"2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"120 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123216832","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Chao Ma, Xiaokang Yang, Chongyang Zhang, Ming-Hsuan Yang
{"title":"Long-term correlation tracking","authors":"Chao Ma, Xiaokang Yang, Chongyang Zhang, Ming-Hsuan Yang","doi":"10.1109/CVPR.2015.7299177","DOIUrl":"https://doi.org/10.1109/CVPR.2015.7299177","url":null,"abstract":"In this paper, we address the problem of long-term visual tracking where the target objects undergo significant appearance variation due to deformation, abrupt motion, heavy occlusion and out-of-view. In this setting, we decompose the task of tracking into translation and scale estimation of objects. We show that the correlation between temporal context considerably improves the accuracy and reliability for translation estimation, and it is effective to learn discriminative correlation filters from the most confident frames to estimate the scale change. In addition, we train an online random fern classifier to re-detect objects in case of tracking failure. Extensive experimental results on large-scale benchmark datasets show that the proposed algorithm performs favorably against state-of-the-art methods in terms of efficiency, accuracy, and robustness.","PeriodicalId":444472,"journal":{"name":"2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"76 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123451636","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Vignesh Ramanathan, Congcong Li, Jia Deng, Wei Han, Zhen Li, Kunlong Gu, Yang Song, Samy Bengio, C. Rosenberg, Li Fei-Fei
{"title":"Learning semantic relationships for better action retrieval in images","authors":"Vignesh Ramanathan, Congcong Li, Jia Deng, Wei Han, Zhen Li, Kunlong Gu, Yang Song, Samy Bengio, C. Rosenberg, Li Fei-Fei","doi":"10.1109/CVPR.2015.7298713","DOIUrl":"https://doi.org/10.1109/CVPR.2015.7298713","url":null,"abstract":"Human actions capture a wide variety of interactions between people and objects. As a result, the set of possible actions is extremely large and it is difficult to obtain sufficient training examples for all actions. However, we could compensate for this sparsity in supervision by leveraging the rich semantic relationship between different actions. A single action is often composed of other smaller actions and is exclusive of certain others. We need a method which can reason about such relationships and extrapolate unobserved actions from known actions. Hence, we propose a novel neural network framework which jointly extracts the relationship between actions and uses them for training better action retrieval models. Our model incorporates linguistic, visual and logical consistency based cues to effectively identify these relationships. We train and test our model on a largescale image dataset of human actions. We show a significant improvement in mean AP compared to different baseline methods including the HEX-graph approach from Deng et al. [8].","PeriodicalId":444472,"journal":{"name":"2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124483733","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Seungryong Kim, Dongbo Min, Bumsub Ham, Seungchul Ryu, M. Do, K. Sohn
{"title":"DASC: Dense adaptive self-correlation descriptor for multi-modal and multi-spectral correspondence","authors":"Seungryong Kim, Dongbo Min, Bumsub Ham, Seungchul Ryu, M. Do, K. Sohn","doi":"10.1109/CVPR.2015.7298822","DOIUrl":"https://doi.org/10.1109/CVPR.2015.7298822","url":null,"abstract":"Establishing dense visual correspondence between multiple images is a fundamental task in many applications of computer vision and computational photography. Classical approaches, which aim to estimate dense stereo and optical flow fields for images adjacent in viewpoint or in time, have been dramatically advanced in recent studies. However, finding reliable visual correspondence in multi-modal or multi-spectral images still remains unsolved. In this paper, we propose a novel dense matching descriptor, called dense adaptive self-correlation (DASC), to effectively address this kind of matching scenarios. Based on the observation that a self-similarity existing within images is less sensitive to modality variations, we define the descriptor with a series of an adaptive self-correlation similarity for patches within a local support window. To further improve the matching quality and runtime efficiency, we propose a randomized receptive field pooling, in which a sampling pattern is optimized with a discriminative learning. Moreover, the computational redundancy that arises when computing densely sampled descriptor over an entire image is dramatically reduced by applying fast edge-aware filtering. Experiments demonstrate the outstanding performance of the DASC descriptor in many cases of multi-modal and multi-spectral correspondence.","PeriodicalId":444472,"journal":{"name":"2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124687610","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xiaochun Cao, Changqing Zhang, H. Fu, Si Liu, Hua Zhang
{"title":"Diversity-induced Multi-view Subspace Clustering","authors":"Xiaochun Cao, Changqing Zhang, H. Fu, Si Liu, Hua Zhang","doi":"10.1109/CVPR.2015.7298657","DOIUrl":"https://doi.org/10.1109/CVPR.2015.7298657","url":null,"abstract":"In this paper, we focus on how to boost the multi-view clustering by exploring the complementary information among multi-view features. A multi-view clustering framework, called Diversity-induced Multi-view Subspace Clustering (DiMSC), is proposed for this task. In our method, we extend the existing subspace clustering into the multi-view domain, and utilize the Hilbert Schmidt Independence Criterion (HSIC) as a diversity term to explore the complementarity of multi-view representations, which could be solved efficiently by using the alternating minimizing optimization. Compared to other multi-view clustering methods, the enhanced complementarity reduces the redundancy between the multi-view representations, and improves the accuracy of the clustering results. Experiments on both image and video face clustering well demonstrate that the proposed method outperforms the state-of-the-art methods.","PeriodicalId":444472,"journal":{"name":"2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124810743","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}