{"title":"Stereoscopic Dataset from A Video Game: Detecting Converged Axes and Perspective Distortions in S3D Videos","authors":"K. Malyshev, S. Lavrushkin, D. Vatolin","doi":"10.1109/IC3D51119.2020.9376375","DOIUrl":"https://doi.org/10.1109/IC3D51119.2020.9376375","url":null,"abstract":"This paper presents a method for generating stereoscopic or multi-angle video frames using a computer game (Grand Theft Auto V). We developed a mod that captures synthetic frames allows us to create geometric distortions like those that occur in a real video. These distortions are the main cause of viewer discomfort when watching 3D movies. Datasets generated in this way can aid in solving problems related to machine-learning-based assessment of stereoscopic- or multi-angle-video quality. We trained a convolutional neural network to evaluate perspective distortions and converged camera axes in stereoscopic video, then tested it on real 3D movies. The neural network discovered multiple examples of these distortions.","PeriodicalId":159318,"journal":{"name":"2020 International Conference on 3D Immersion (IC3D)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132177728","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Machine Perception Point Cloud Quality Assessment Via Vision Tasks","authors":"Jiapeng Lu, Linyao Gao, Wenjie Zhu, Yiling Xu","doi":"10.1109/IC3D51119.2020.9376344","DOIUrl":"https://doi.org/10.1109/IC3D51119.2020.9376344","url":null,"abstract":"LiDAR sensing is a newly developed 3D acquisition technology which is widely applied in auto-driving area. Different from the human perception point cloud, the generated 3D data is machine perception point clouds which are designed for specific vision tasks in realistic life, such as point cloud detection, segmentation and recognition. Therefore, instead of traditional subjective quality estimation, the quality assessment of machine perception point cloud is a new challenge. In this paper, we propose a machine perception point cloud quality assessment via various vision tasks, evaluating the point cloud quality based on the performance in vision tasks of different level of distorted point cloud. Firstly, we utilize the state-of-the-art point cloud compression algorithm to obtain the distorted point cloud. Then, we explore the potentials of distorted point clouds in detection and segmentation precision, comparing the results in different testing conditions. Finally, we propose the machine perception ROI based point cloud compression framework achieves notable performance on vision tasks result while do insignificant influence on PSNR.The experimental results illustrate the correspondence between point cloud quality and the performance in vision tasks, verifying the effectiveness of the proposed method.","PeriodicalId":159318,"journal":{"name":"2020 International Conference on 3D Immersion (IC3D)","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131946665","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Implications of Interpupillary Distance Variability for Virtual Reality","authors":"P. Hibbard, L. Dam, P. Scarfe","doi":"10.1109/IC3D51119.2020.9376369","DOIUrl":"https://doi.org/10.1109/IC3D51119.2020.9376369","url":null,"abstract":"Creating and presenting binocular images for virtual reality and other 3D displays needs to take account of the interpupillary distance-the distance between the user's eyes. While VR headsets allow for some adjustments of this setting, this does not accommodate the full range found in the population, and may not necessarily be accurately measured and adjusted in practice. A mismatch between the observer's IPD and that assumed in creating and presenting stimuli will tend to cause problems with viewing comfort and accurate depth perception. We identify unnatural eye fixations, visual discomfort and inaccurate depth perception as important considerations for understanding the suitability of VR for use by children. We present a geometrical quantification of each of these factors.","PeriodicalId":159318,"journal":{"name":"2020 International Conference on 3D Immersion (IC3D)","volume":"344 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134286916","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Toward Texturing for Immersive Modeling of Environment Reconstructed from 360 Multi-Camera","authors":"M. Lhuillier","doi":"10.1109/IC3D51119.2020.9376323","DOIUrl":"https://doi.org/10.1109/IC3D51119.2020.9376323","url":null,"abstract":"The computation of a textured 3D model of a scene using a camera has three steps: acquisition, reconstruction and texturing. The texturing is important for visualization applications since it removes visual artifacts due to inaccuracies of the reconstruction, varying photometric parameters of the camera and non-Lambertian scene. This paper presents the first texturing pipeline for an unfrequent but important case: the reconstruction of immersive 3D models of complete environments from images taken by a 360 multi-camera moving on the ground. We contribute in many ways: sky texturing (not done in previous work), estimation of gain and bias corrections, and seam leveling. All methods are designed to deal with ordered sequences of thousands of keyframes. In the experiments, we start from videos taken by biking during 25 minutes in a campus using a helmet-held Garmin Virb 360.","PeriodicalId":159318,"journal":{"name":"2020 International Conference on 3D Immersion (IC3D)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122468607","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"VST3D-Net:Video-Based Spatio-Temporal Network for 3D Shape Reconstruction from a Video","authors":"Jinglun Yang, Guanglun Zhang, Youhua Li, Lu Yang","doi":"10.1109/IC3D51119.2020.9376350","DOIUrl":"https://doi.org/10.1109/IC3D51119.2020.9376350","url":null,"abstract":"In this paper, we propose the Video-based Spatio-Temporal 3D Network (VST3D-Net), which is a novel learning approach of viewpoint-invariant 3D shape reconstruction from monocular video. In our VST3D-Net, a spatial feature extraction subnetwork is designed to encode the local and global spatial relationships of the object in the image. The extracted latent spatial features have implicitly embedded both shape and pose information. Although a single view can also be used to recover a 3D shape, more rich shape information of the dynamic object can be explored and leveraged from video frames. To generate the viewpoint-free 3D shape, we design a temporal correlation feature extractor. It handles the temporal consistency of the shape and pose of the moving object simultaneously. Therefore, both the canonical 3D shape and the corresponding pose at different frame are recovered by the network. We validate our approach on the ShapeNet-based video dataset and ApolloCar3D dataset. The experimental results show the proposed VST3D-Net can outperform the state-of-the-art approaches both in accuracy and efficiency.","PeriodicalId":159318,"journal":{"name":"2020 International Conference on 3D Immersion (IC3D)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129055965","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}