Eduard Serradell, Adriana Romero, R. Leta, C. Gatta, F. Moreno-Noguer
{"title":"Simultaneous correspondence and non-rigid 3D reconstruction of the coronary tree from single X-ray images","authors":"Eduard Serradell, Adriana Romero, R. Leta, C. Gatta, F. Moreno-Noguer","doi":"10.1109/ICCV.2011.6126325","DOIUrl":"https://doi.org/10.1109/ICCV.2011.6126325","url":null,"abstract":"We present a novel approach to simultaneously reconstruct the 3D structure of a non-rigid coronary tree and estimate point correspondences between an input X-ray image and a reference 3D shape. At the core of our approach lies an optimization scheme that iteratively fits a generative 3D model of increasing complexity and guides the matching process. As a result, and in contrast to existing approaches that assume rigidity or quasi-rigidity of the structure, our method is able to retrieve large non-linear deformations even when the input data is corrupted by the presence of noise and partial occlusions. We extensively evaluate our approach under synthetic and real data and demonstrate a remarkable improvement compared to state-of-the-art.","PeriodicalId":6391,"journal":{"name":"2011 International Conference on Computer Vision","volume":"9 1","pages":"850-857"},"PeriodicalIF":0.0,"publicationDate":"2011-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74275456","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Horesh Ben Shitrit, J. Berclaz, F. Fleuret, P. Fua
{"title":"Tracking multiple people under global appearance constraints","authors":"Horesh Ben Shitrit, J. Berclaz, F. Fleuret, P. Fua","doi":"10.1109/ICCV.2011.6126235","DOIUrl":"https://doi.org/10.1109/ICCV.2011.6126235","url":null,"abstract":"In this paper, we show that tracking multiple people whose paths may intersect can be formulated as a convex global optimization problem. Our proposed framework is designed to exploit image appearance cues to prevent identity switches. Our method is effective even when such cues are only available at distant time intervals. This is unlike many current approaches that depend on appearance being exploitable from frame to frame. We validate our approach on three multi-camera sport and pedestrian datasets that contain long and complex sequences. Our algorithm preserves identities better than state-of-the-art algorithms while keeping similar MOTA scores.","PeriodicalId":6391,"journal":{"name":"2011 International Conference on Computer Vision","volume":"79 1","pages":"137-144"},"PeriodicalIF":0.0,"publicationDate":"2011-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75385237","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
L. Marchesotti, F. Perronnin, Diane Larlus, G. Csurka
{"title":"Assessing the aesthetic quality of photographs using generic image descriptors","authors":"L. Marchesotti, F. Perronnin, Diane Larlus, G. Csurka","doi":"10.1109/ICCV.2011.6126444","DOIUrl":"https://doi.org/10.1109/ICCV.2011.6126444","url":null,"abstract":"In this paper, we automatically assess the aesthetic properties of images. In the past, this problem has been addressed by hand-crafting features which would correlate with best photographic practices (e.g. “Does this image respect the rule of thirds?”) or with photographic techniques (e.g. “Is this image a macro?”). We depart from this line of research and propose to use generic image descriptors to assess aesthetic quality. We experimentally show that the descriptors we use, which aggregate statistics computed from low-level local features, implicitly encode the aesthetic properties explicitly used by state-of-the-art methods and outperform them by a significant margin.","PeriodicalId":6391,"journal":{"name":"2011 International Conference on Computer Vision","volume":"180 1","pages":"1784-1791"},"PeriodicalIF":0.0,"publicationDate":"2011-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75471472","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Gaussian process regression flow for analysis of motion trajectories","authors":"Kihwan Kim, Dongryeol Lee, Irfan Essa","doi":"10.1109/ICCV.2011.6126365","DOIUrl":"https://doi.org/10.1109/ICCV.2011.6126365","url":null,"abstract":"Recognition of motions and activities of objects in videos requires effective representations for analysis and matching of motion trajectories. In this paper, we introduce a new representation specifically aimed at matching motion trajectories. We model a trajectory as a continuous dense flow field from a sparse set of vector sequences using Gaussian Process Regression. Furthermore, we introduce a random sampling strategy for learning stable classes of motions from limited data. Our representation allows for incrementally predicting possible paths and detecting anomalous events from online trajectories. This representation also supports matching of complex motions with acceleration changes and pauses or stops within a trajectory. We use the proposed approach for classifying and predicting motion trajectories in traffic monitoring domains and test on several data sets. We show that our approach works well on various types of complete and incomplete trajectories from a variety of video data sets with different frame rates.","PeriodicalId":6391,"journal":{"name":"2011 International Conference on Computer Vision","volume":"36 1","pages":"1164-1171"},"PeriodicalIF":0.0,"publicationDate":"2011-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74557672","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Unsupervised and semi-supervised learning via ℓ1-norm graph","authors":"F. Nie, Hua Wang, Heng Huang, C. Ding","doi":"10.1109/ICCV.2011.6126506","DOIUrl":"https://doi.org/10.1109/ICCV.2011.6126506","url":null,"abstract":"In this paper, we propose a novel ℓ1-norm graph model for unsupervised and semi-supervised learning. Instead of minimizing the ℓ2-norm of the spectral embedding, as traditional graph-based learning methods do, our new graph learning model minimizes the ℓ1-norm of the spectral embedding, a choice that is well motivated. The sparsity produced by the ℓ1-norm minimization results in solutions with much clearer cluster structures, which are suitable for both image clustering and classification tasks. We introduce a new efficient iterative algorithm to solve the ℓ1-norm spectral embedding minimization problem, and prove the convergence of the algorithm. More specifically, our algorithm adaptively re-weights the original weights of the graph to discover a clearer cluster structure. Experimental results on both toy data and real image data sets show the effectiveness and advantages of our proposed method.","PeriodicalId":6391,"journal":{"name":"2011 International Conference on Computer Vision","volume":"14 1","pages":"2268-2273"},"PeriodicalIF":0.0,"publicationDate":"2011-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74597016","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Color photometric stereo for multicolored surfaces","authors":"Robert Anderson, B. Stenger, R. Cipolla","doi":"10.1109/ICCV.2011.6126495","DOIUrl":"https://doi.org/10.1109/ICCV.2011.6126495","url":null,"abstract":"We present a multispectral photometric stereo method for capturing geometry of deforming surfaces. A novel photometric calibration technique allows calibration of scenes containing multiple piecewise constant chromaticities. This method estimates per-pixel photometric properties, then uses a RANSAC-based approach to estimate the dominant chromaticities in the scene. A likelihood term is developed linking surface normal, image intensity and photometric properties, which allows estimating the number of chromaticities present in a scene to be framed as a model estimation problem. The Bayesian Information Criterion is applied to automatically estimate the number of chromaticities present during calibration. A two-camera stereo system provides low resolution geometry, allowing the likelihood term to be used in segmenting new images into regions of constant chromaticity. This segmentation is carried out in a Markov Random Field framework and allows the correct photometric properties to be used at each pixel to estimate a dense normal map. Results are shown on several challenging real-world sequences, demonstrating state-of-the-art results using only two cameras and three light sources. Quantitative evaluation is provided against synthetic ground truth data.","PeriodicalId":6391,"journal":{"name":"2011 International Conference on Computer Vision","volume":"27 1","pages":"2182-2189"},"PeriodicalIF":0.0,"publicationDate":"2011-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74741606","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zhihu Chen, Kwan-Yee Kenneth Wong, Y. Matsushita, Xiaolong Zhu, Miaomiao Liu
{"title":"Self-calibrating depth from refraction","authors":"Zhihu Chen, Kwan-Yee Kenneth Wong, Y. Matsushita, Xiaolong Zhu, Miaomiao Liu","doi":"10.1109/ICCV.2011.6126298","DOIUrl":"https://doi.org/10.1109/ICCV.2011.6126298","url":null,"abstract":"In this paper, we introduce a novel method for depth acquisition based on refraction of light. A scene is captured twice by a fixed perspective camera, with the first image captured directly by the camera and the second by placing a transparent medium between the scene and the camera. A depth map of the scene is then recovered from the displacements of scene points in the images. Unlike other existing depth from refraction methods, our method does not require the knowledge of the pose and refractive index of the transparent medium, but can recover them directly from the input images. We hence call our method self-calibrating depth from refraction. Experimental results on both synthetic and real-world data are presented, which demonstrate the effectiveness of the proposed method.","PeriodicalId":6391,"journal":{"name":"2011 International Conference on Computer Vision","volume":"2 1","pages":"635-642"},"PeriodicalIF":0.0,"publicationDate":"2011-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73112717","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Diffusion runs low on persistence fast","authors":"Chao Chen, H. Edelsbrunner","doi":"10.1109/ICCV.2011.6126271","DOIUrl":"https://doi.org/10.1109/ICCV.2011.6126271","url":null,"abstract":"Interpreting an image as a function on a compact subset of the Euclidean plane, we get its scale-space by diffusion, spreading the image over the entire plane. This generates a 1-parameter family of functions alternatively defined as convolutions with a progressively wider Gaussian kernel. We prove that the corresponding 1-parameter family of persistence diagrams have norms that go rapidly to zero as time goes to infinity. This result rationalizes experimental observations about scale-space. We hope this will lead to targeted improvements of related computer vision methods.","PeriodicalId":6391,"journal":{"name":"2011 International Conference on Computer Vision","volume":"106 1","pages":"423-430"},"PeriodicalIF":0.0,"publicationDate":"2011-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76107376","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Carsten Stoll, N. Hasler, Juergen Gall, H. Seidel, C. Theobalt
{"title":"Fast articulated motion tracking using a sums of Gaussians body model","authors":"Carsten Stoll, N. Hasler, Juergen Gall, H. Seidel, C. Theobalt","doi":"10.1109/ICCV.2011.6126338","DOIUrl":"https://doi.org/10.1109/ICCV.2011.6126338","url":null,"abstract":"We present an approach for modeling the human body by Sums of spatial Gaussians (SoG), allowing us to perform fast and high-quality markerless motion capture from multi-view video sequences. The SoG model is equipped with a color model to represent the shape and appearance of the human and can be reconstructed from a sparse set of images. Similar to the human body, we also represent the image domain as SoG that models color consistent image blobs. Based on the SoG models of the image and the human body, we introduce a novel continuous and differentiable model-to-image similarity measure that can be used to estimate the skeletal motion of a human at 5–15 frames per second even for many camera views. In our experiments, we show that our method, which does not rely on silhouettes or training data, offers a good balance between accuracy and computational cost.","PeriodicalId":6391,"journal":{"name":"2011 International Conference on Computer Vision","volume":"416 1","pages":"951-958"},"PeriodicalIF":0.0,"publicationDate":"2011-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80105962","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The medial feature detector: Stable regions from image boundaries","authors":"Yannis Avrithis, Konstantinos Rapantzikos","doi":"10.1109/ICCV.2011.6126436","DOIUrl":"https://doi.org/10.1109/ICCV.2011.6126436","url":null,"abstract":"We present a local feature detector that is able to detect regions of arbitrary scale and shape, without scale space construction. We compute a weighted distance map on image gradient, using our exact linear-time algorithm, a variant of group marching for Euclidean space. We find the weighted medial axis by extending residues, typically used in Voronoi skeletons. We decompose the medial axis into a graph representing image structure in terms of peaks and saddle points. A duality property enables reconstruction of regions using the same marching method. We greedily group regions taking both contrast and shape into account. On the way, we select regions according to our shape fragmentation factor, favoring those well enclosed by boundaries—even incomplete. We achieve state of the art performance in matching and retrieval experiments with reduced memory and computational requirements.","PeriodicalId":6391,"journal":{"name":"2011 International Conference on Computer Vision","volume":"24 1","pages":"1724-1731"},"PeriodicalIF":0.0,"publicationDate":"2011-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86549554","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}