{"title":"A Non-cooperative Game for 3D Object Recognition in Cluttered Scenes","authors":"A. Albarelli, E. Rodolà, Filippo Bergamasco, A. Torsello","doi":"10.1109/3DIMPVT.2011.39","DOIUrl":"https://doi.org/10.1109/3DIMPVT.2011.39","url":null,"abstract":"During the last few years a wide range of algorithms and devices have been made available to easily acquire range images. To this extent, the increasing abundance of depth data boosts the need for reliable and unsupervised analysis techniques, spanning from part registration to automated segmentation. In this context, we focus on the recognition of known objects in cluttered and incomplete 3D scans. Fitting a model to a scene is a very important task in many scenarios such as industrial inspection, scene understanding and even gaming. For this reason, this problem has been extensively tackled in literature. Nevertheless, while many descriptor-based approaches have been proposed, a number of hurdles still hinder the use of global techniques. In this paper we try to offer a different perspective on the topic. Specifically, we adopt an evolutionary selection algorithm in order to extend the scope of local descriptors to satisfy global pair wise constraints. In addition, the very same technique is also used to shift from an initial sparse correspondence to a dense matching. 
This leads to a novel pipeline for 3D object recognition, which is validated with an extensive set of experiments and comparisons with recent well-known feature-based approaches.","PeriodicalId":330003,"journal":{"name":"2011 International Conference on 3D Imaging, Modeling, Processing, Visualization and Transmission","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116740659","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Scene Segmentation Assisted by Stereo Vision","authors":"Carlo Dal Mutto, P. Zanuttigh, G. Cortelazzo, S. Mattoccia","doi":"10.1109/3DIMPVT.2011.16","DOIUrl":"https://doi.org/10.1109/3DIMPVT.2011.16","url":null,"abstract":"Stereo vision systems for 3D reconstruction have been deeply studied and are nowadays capable to provide a reasonably accurate estimate of the 3D geometry of a framed scene. They are commonly used to merely extract the 3D structure of the scene. However, a great variety of applications is not interested in the geometry itself, but rather in scene analysis operations, among which scene segmentation is a very important one. Classically, scene segmentation has been tackled by means of color information only, but it turns out to be a badly conditioned image processing operation which remains very challenging. This paper proposes a new framework for scene segmentation where color information is assisted by 3D geometry data, obtained by stereo vision techniques. This approach resembles in some way what happens inside our brain, where the two different views coming from the eyes are used to recognize the various object in the scene and by exploiting a pair of images instead of just one allows to greatly improve the segmentation quality and robustness. Clearly the performance of the approach is dependent on the specific stereo vision algorithm used in order to extract the geometry information. This paper investigates which stereo vision algorithms are best suited to this kind of analysis. 
Experimental results confirm the effectiveness of the proposed framework and allow to properly rank stereo vision systems on the basis of their performances when applied to the scene segmentation problem.","PeriodicalId":330003,"journal":{"name":"2011 International Conference on 3D Imaging, Modeling, Processing, Visualization and Transmission","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125715568","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Space-Time Body Pose Estimation in Uncontrolled Environments","authors":"Marcel Germann, T. Popa, R. Ziegler, Richard Keiser, M. Gross","doi":"10.1109/3DIMPVT.2011.38","DOIUrl":"https://doi.org/10.1109/3DIMPVT.2011.38","url":null,"abstract":"We propose a data-driven, multi-view body pose estimation algorithm for video. It can operate in uncontrolled environments with loosely calibrated and low resolution cameras and without restricting assumptions on the family of possible poses or motions. Our algorithm first estimates a rough pose estimation using a spatial and temporal silhouette based search in a database of known poses. The estimated pose is improved in a novel pose consistency step acting locally on single frames and globally over the entire sequence. Finally, the resulting pose estimation is refined in a spatial and temporal pose optimization consisting of novel constraints to obtain an accurate pose. Our method proved to perform well on low resolution video footage from real broadcast of soccer games.","PeriodicalId":330003,"journal":{"name":"2011 International Conference on 3D Imaging, Modeling, Processing, Visualization and Transmission","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121986609","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Surface Reconstruction in Photometric Stereo with Calibration Error","authors":"Michihiro Kobayashi, Takahiro Okabe, Y. Matsushita, Yoichi Sato","doi":"10.1109/3DIMPVT.2011.13","DOIUrl":"https://doi.org/10.1109/3DIMPVT.2011.13","url":null,"abstract":"A method is described for surface reconstruction that accounts for the calibration errors in photometric stereo. The angular errors in calibrated light directions due to noise cause errors in surface normal estimates. Investigation of the effect of these calibration errors on the surface normals revealed that errors in the estimated light directions and in the surface normal estimates follow a Fisher distribution. By accounting for the Fisher noise in surface normals, the proposed method reconstructs a surface using maximum likelihood estimation. Extensive comparison with previous methods using synthetic and real scenes demonstrated that the proposed method outperforms them in the presence of calibration errors.","PeriodicalId":330003,"journal":{"name":"2011 International Conference on 3D Imaging, Modeling, Processing, Visualization and Transmission","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124639715","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"6DoF Registration of 2D Laser Scans","authors":"B. Huhle, T. Schairer, A. Schilling, W. Straßer","doi":"10.1109/3DIMPVT.2011.26","DOIUrl":"https://doi.org/10.1109/3DIMPVT.2011.26","url":null,"abstract":"We address the problem of registering a set of 2D laser scans in 3D space with regard to six degrees of freedom. Registering single 2D scans is only possible when making strong assumptions on the structure of the scene or on the acquisition process, since only a slice of the 3D environment is captured and the information content is very limited. With a combination of two differently oriented laser scanners, however, the registration problem becomes feasible. We present a method that is based on the idea of preserving the free space represented in each of these combined scans. On realistically simulated laser range data we show that, given a sufficient sampling density, the proposed algorithm is capable to recover from large translational and moderate rotational errors in the initial configuration.","PeriodicalId":330003,"journal":{"name":"2011 International Conference on 3D Imaging, Modeling, Processing, Visualization and Transmission","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129010441","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fast and Accurate 3D Scanning Using Coded Phase Shifting and High Speed Pattern Projection","authors":"P. Wissmann, R. Schmitt, F. Forster","doi":"10.1109/3DIMPVT.2011.21","DOIUrl":"https://doi.org/10.1109/3DIMPVT.2011.21","url":null,"abstract":"We describe a structured light phase measuring triangulation technique extending the conventional four-step phase shift sequence with embedded information suited to assist the phase unwrapping process. Using the embedded information, we perform automatic phase unwrapping in the presence of discontinuous or isolated surfaces without extending the length of the phase shift sequence or requiring stereo cameras. We demonstrate the application of the proposed method using a novel structured light projector capable of extraordinarily high projection frequencies and pattern resolution, as well as grayscale quantization. Using high speed cameras, we demonstrate 3D measurements at 20ms total acquisition time for both mono- and stereoscopic camera configurations.","PeriodicalId":330003,"journal":{"name":"2011 International Conference on 3D Imaging, Modeling, Processing, Visualization and Transmission","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126867590","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Scalable Object Classification Using Range Images","authors":"Eunyoung Kim, G. Medioni","doi":"10.1109/3DIMPVT.2011.63","DOIUrl":"https://doi.org/10.1109/3DIMPVT.2011.63","url":null,"abstract":"We present a novel scalable framework for free-form object classification in range images. The framework includes an automatic 3D object recognition system in range images and a scalable database structure to learn new instances and new categories efficiently. We adopt the TAX model, previously proposed for un-supervised object modeling in 2D images, to construct our hierarchical model of object classes from unlabelled range images. The hierarchical model embodies unorganized shape patterns of 3D objects in various classes in a tree structure with probabilistic distributions. A new visual vocabulary is introduced to represent a range image as a set of visual words for the process of hierarchical model inference, classification and online learning. We also propose an online learning algorithm that updates the hierarchical model efficiently thanks to the tree structure, when a new object should be learned into the model. Extensive experiments demonstrate average classification rates of 94% on a large synthetic dataset (1,350 training images and 450 test images for 9 object classes) and 88.4% on 1,433 depth images captured from real-time range sensors. 
We also show that our approach outperforms the original TAX method in terms of recall rate and stability.","PeriodicalId":330003,"journal":{"name":"2011 International Conference on 3D Imaging, Modeling, Processing, Visualization and Transmission","volume":"84 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114191695","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multiscale 3D Feature Extraction and Matching","authors":"Hadi Fadaifard, G. Wolberg","doi":"10.1109/3DIMPVT.2011.36","DOIUrl":"https://doi.org/10.1109/3DIMPVT.2011.36","url":null,"abstract":"Partial 3D shape matching refers to the process of computing a similarity measure between partial regions of 3D objects. This remains a difficult challenge without emph{a priori} knowledge of the scale of the input objects, as well as their rotation and translation. This paper focuses on the problem of partial shape matching among 3D objects of unknown scale. We consider the problem of face detection on arbitrary 3D surfaces and introduce a multiscale surface representation for feature extraction and matching. This work is motivated by the scale-space theory for images. Scale-space based techniques have proven very successful for dealing with noise and scale changes in matching applications for 2D images. However, efficient and practical scale-space representations for 3D surfaces are lacking. Our proposed scale-space representation is defined in terms of the evolution of surface curvatures according to the heat equation. This representation is shown to be insensitive to noise, computationally efficient, and capable of automatic scale selection. Examples in face detection and surface registration are given.","PeriodicalId":330003,"journal":{"name":"2011 International Conference on 3D Imaging, Modeling, Processing, Visualization and Transmission","volume":"56 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116551850","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reconstruction and Accurate Alignment of Feature Maps for Augmented Reality","authors":"Folker Wientapper, H. Wuest, Arjan Kuijper","doi":"10.1109/3DIMPVT.2011.25","DOIUrl":"https://doi.org/10.1109/3DIMPVT.2011.25","url":null,"abstract":"This paper focuses on the preparative process of retrieving accurate feature maps for a camera-based tracking system. With this system it is possible to create ready-to use Augmented Reality applications with a very easy setup work-flow, which in practice only involves three steps: filming the object or environment from various viewpoints, defining a transformation between the reconstructed map and the target coordinate frame based on a small number of 3D-3D correspondences and, finally, initiating a feature learning and Bundle Adjustment step. Technically, the solution comprises several sub-algorithms. Given the image sequence provided by the user, a feature map is initially reconstructed and incrementally extended using a Simultaneous-Localization-and-Mapping (SLAM) approach. For the automatic initialization of the SLAM module, a method for detecting the amount of translation is proposed. Since the initially reconstructed map is defined in an arbitrary coordinate system, we present a method for optimally aligning the feature map to the target coordinated frame of the augmentation models based on 3D-3D correspondences defined by the user. As an initial estimate we solve for a rigid transformation with scaling, known as Absolute Orientation. For refinement of the alignment we present a modification of the well-known Bundle Adjustment, where we include these 3D-3D-correspondences as constraints. Compared to ordinary Bundle Adjustment we show that this leads to significantly more accurate reconstructions, since map deformations due to systematic errors such as small camera calibration errors or outliers are well compensated. 
This again results in a better alignment of the augmentations during run-time of the application, even in large-scale environments.","PeriodicalId":330003,"journal":{"name":"2011 International Conference on 3D Imaging, Modeling, Processing, Visualization and Transmission","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122733551","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Stereo Reconstruction of Building Interiors with a Vertical Structure Prior","authors":"Bernhard Zeisl, C. Zach, M. Pollefeys","doi":"10.1109/3DIMPVT.2011.53","DOIUrl":"https://doi.org/10.1109/3DIMPVT.2011.53","url":null,"abstract":"Image-based computation of a 3D map for an indoor environment is a very challenging task, but also a useful step for vision-based navigation and path planning for autonomous systems, and for efficient visualization of interior spaces. Since computational stereo is a highly ill-posed problem for the typically weakly textured, specular, and even sometimes transparent indoor environments, one has to incorporate very strong prior assumptions on the observed geometry. A natural assumption for building interiors is that open space is bounded (i) by parallel ground and ceiling planes, and (ii) by vertical (not necessarily orthogonal) wall elements. We employ this assumption as a strong prior in dense depth estimation from stereo images. The additional assumption of smooth vertical elements allows our approach to fill in plausible extensions of e.g. walls in case of (non-vertical) occlusions. It is also possible to explicitly detect non-vertical regions in the images, and to revert to more general stereo methods only in those areas. 
We demonstrate our method on several challenging stereo images of office environments.","PeriodicalId":330003,"journal":{"name":"2011 International Conference on 3D Imaging, Modeling, Processing, Visualization and Transmission","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131749564","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}