{"title":"A Novel 3D-Unet Deep Learning Framework Based on High-Dimensional Bilateral Grid for Edge Consistent Single Image Depth Estimation","authors":"Mansi Sharma, Abheesht Sharma, Kadvekar Rohit Tushar, Avinash Panneer","doi":"10.1109/IC3D51119.2020.9376327","DOIUrl":"https://doi.org/10.1109/IC3D51119.2020.9376327","url":null,"abstract":"The task of predicting smooth and edge-consistent depth maps is notoriously difficult for single image depth estimation. This paper proposes a novel Bilateral Grid based 3D convolutional neural network, dubbed as 3DBG-UNet, that parameterize high dimensional feature space by encoding compact 3D bilateral grids with UNets and infers sharp geometric layout of the scene. Further, an another novel 3DBGES-UNet model is introduced that integrate 3DBG-UNet for inferring an accurate depth map given a single color view. The 3DBGES-UNet concatenate 3DBG-UNet geometry map with the inception network edge accentuation map and a spatial object's boundary map obtained by leveraging semantic segmentation and train the UNet model with ResNet backbone. Both models are designed with a particular attention to explicitly account for edges or minute details. Preserving sharp discontinuities at depth edges is critical for many applications such as realistic integration of virtual objects in AR video or occlusion-aware view synthesis for 3D display applications. The proposed depth prediction network achieves state-of-the-art performance in both qualitative and quantitative evaluations on the challenging NYUv2-Depth data. The code and corresponding pre-trained weights will be made publicly available.","PeriodicalId":159318,"journal":{"name":"2020 International Conference on 3D Immersion (IC3D)","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129458331","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"[Copyright notice]","authors":"","doi":"10.1109/ic3d51119.2020.9376333","DOIUrl":"https://doi.org/10.1109/ic3d51119.2020.9376333","url":null,"abstract":"","PeriodicalId":159318,"journal":{"name":"2020 International Conference on 3D Immersion (IC3D)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114557546","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Device for Capturing Inward-Looking Spherical Light Fields","authors":"Quentin Bolsée, W. Darwish, Daniele Bonatto, G. Lafruit, A. Munteanu","doi":"10.1109/IC3D51119.2020.9376346","DOIUrl":"https://doi.org/10.1109/IC3D51119.2020.9376346","url":null,"abstract":"We propose a calibration methodology for a novel type of inward-looking spherical light field acquisition device comprised of a moving CMOS camera with two angular degrees of freedom around an object. We designed a calibration cube covered with ChArUco markers to resolve viewpoint ambiguity. Our calibration model includes 20 unknowns describing the spherical parameters and the camera's geometrical and internal properties. These parameters are jointly optimized to reduce the reprojection error of the calibration cube's markers from multiple viewpoints, resulting in a model that can predict the pose of the camera from any other viewpoint. We successfully tested this calibrated model with a photogrammetry experiment followed by view synthesis of novel views using the resulting depth maps. Results show that the reconstructed image is highly accurate when compared to a real-life capturing of the same view.","PeriodicalId":159318,"journal":{"name":"2020 International Conference on 3D Immersion (IC3D)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131311541","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Synthesis of Computer Generated Holograms Using Layer-Based Method and Perspective Projection Images","authors":"Vincent Brac de la Perriere, V. Drazic, D. Doyen, Arno Schubert","doi":"10.1109/IC3D51119.2020.9376326","DOIUrl":"https://doi.org/10.1109/IC3D51119.2020.9376326","url":null,"abstract":"From its Nobel prize winning discovery by Denis Gabor over half a century ago, Holography has been alternately put in the spotlight as a promising technique for its capacity of displaying 3D scenes, to be later forgotten regarding the complexity of performing holographic recording outside from optical laboratories. The later development of high resolution microdis-plays, high capacity CPU and GPU units raised the interest of using Holography as a mean to reproduce dynamic 3D scenes through the computation of holograms and the display of the associated interference pattern on a Spatial Light Modulators (SLMs). The computation process can for example be performed from a layered Computer Generated (CG) scene, using well-known convolution methods or other transfer function models from Fourier optics. The simplicity and speed of this technique is unfortunately compensated by its limitation to small object and scenes. We here want to demonstrate the possibility to use similar techniques to compute holograms of larger field of view, using perspective-projection layered scene and well-known Fourier optics tools.","PeriodicalId":159318,"journal":{"name":"2020 International Conference on 3D Immersion (IC3D)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121217071","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"“The Mystery of the Raddlesham Mumps”: a Case Study for Combined Storytelling in a Theatre Play and Virtual Reality","authors":"L. Dam, Abigail L. M. Webb, Liam Jarvis, P. Hibbard, M. Linley","doi":"10.1109/IC3D51119.2020.9376391","DOIUrl":"https://doi.org/10.1109/IC3D51119.2020.9376391","url":null,"abstract":"“The Mystery of the Raddlesham Mumps” is a poem by Murray Lachlan Young, aimed at both children and adults. This poem has been adapted as a theatre play with a short prequel as a Virtual Reality (VR) / tablet app. We used this unique combination to explore the potential interaction between these different media elements for the level of “pres-ence” and “immersion” in the story (i.e. the level to which one can imagine oneself within the story at the expense of the sense of physical time and space). The theatre audience had the opportunity to play the VR / tablet app in the foyer before the performance started. After the performance, a questionnaire measured participants' level of immersion and presence in the theatre play and their enjoyment of both play and app. The results showed that people of all ages interacted with and liked the app. Ratings for the play were also high and did not depend on prior engagement with the app. However, the play was liked more by adults than children, and the reverse was true for the app, suggesting a potential generation shift in multimedia story telling.","PeriodicalId":159318,"journal":{"name":"2020 International Conference on 3D Immersion (IC3D)","volume":"63 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131665591","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Neural-Network-Based Detection Methods for Color, Sharpness, and Geometry Artifacts in Stereoscopic and VR180 Videos","authors":"S. Lavrushkin, Konstantin Kozhemyakov, D. Vatolin","doi":"10.1109/IC3D51119.2020.9376385","DOIUrl":"https://doi.org/10.1109/IC3D51119.2020.9376385","url":null,"abstract":"Shooting video in 3D format can introduce stereoscopic artifacts, potentially causing viewers visual discomfort. In this work, we consider three common stereoscopic artifacts: color mismatch, sharpness mismatch, and geometric distortion. This paper introduces two neural-network-based methods for simultaneous color- and sharpness-mismatch estimation, as well as for estimating geometric distortions. To train these networks we prepared large datasets based on frames from full-length stereoscopic movies and compared the results with methods that previously served in analyses of full-length stereoscopic movies. We used our proposed methods to analyze 100 videos in VR180 format-a new format for stereoscopic videos in virtual reality (VR). This work presents overall results for these videos along with several examples of detected problems.","PeriodicalId":159318,"journal":{"name":"2020 International Conference on 3D Immersion (IC3D)","volume":"68 2","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133391731","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Practical Approach for Microphone Array Calibration in Augmented and Virtual Reality Applications","authors":"Noman Akbar, G. Dickins, Mark R. P. Thomas","doi":"10.1109/IC3D51119.2020.9376386","DOIUrl":"https://doi.org/10.1109/IC3D51119.2020.9376386","url":null,"abstract":"Spatial sound is rapidly becoming an important part of augmented reality (AR) and virtual reality (VR) applications. Microphone calibration is essential for providing spatial sound capture and reproduction capabilities in AR/VR. This paper presents RAndom PerturbatIons for Diffuse-field (RAPID), a precise method for the diffuse field calibration of the magnitude responses of microphones. The performance of RAPID is compared against the ground truth obtained using a well-established method utilising a two-axis turntable. Notably, it is possible to perform calibration using RAPID in 30 seconds, whereas calibration using the turntable method can take up to 4 hours. RAPID achieves similar results to the ground truth with significantly less effort and setup requirements. A tool is also available as an open source MATLAB application for performing RAPID calibration with ±0.5 dB accuracy.","PeriodicalId":159318,"journal":{"name":"2020 International Conference on 3D Immersion (IC3D)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124624643","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust Calibration of a Multi-View Azure Kinect Scanner Based on Spatial Consistency","authors":"W. Darwish, Quentin Bolsée, A. Munteanu","doi":"10.1109/IC3D51119.2020.9376321","DOIUrl":"https://doi.org/10.1109/IC3D51119.2020.9376321","url":null,"abstract":"In this work, we introduce a new calibration method for a camera system comprising five Azure Kinect. The calibration method uses a ChArUco coded cube installed in the middle of the system. A new 3D optimization cost is proposed to overcome the IR camera noise and to enhance global 3D consistency of the captured model. The cost includes the repro-jection error and the point to plane distance. As a refinement stage, along with point to plane distance, a patch to plane distance is added in the cost to overcome the noise effect of the depth camera. The experimental results demonstrate that the proposed calibration method achieves a better reprojection error and more stable results in terms of standard deviation of the estimated pose compared to the state-of-the-art. In addition, the qualitative results show that the proposed method can produce a better registered point cloud compared to conventional calibration.","PeriodicalId":159318,"journal":{"name":"2020 International Conference on 3D Immersion (IC3D)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132339203","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Training Firefighters in Virtual Reality","authors":"Michel Bellemans, D. Lamrnens, J. D. Sloover, Tom De Vleeschauwer, Evarest Schoofs, W. Jordens, B. V. Steenhuyse, J. Mangelschots, S. Selleri, Charles Hamesse, Timothée Fréville, R. Haeltermani","doi":"10.1109/IC3D51119.2020.9376336","DOIUrl":"https://doi.org/10.1109/IC3D51119.2020.9376336","url":null,"abstract":"Virtual Reality applications have become mainstream, not only for gamers, but also (to a lesser extent) for industry, where they can be used for training in settings that would be too dangerous, or too expensive, to replicate in real life. In this paper we give a description of a recent Virtual Reality application that was built in close collaboration between the Royal Military Academy, the Belgian Navy and industry to allow future fire-fighters to be trained in a virtual reproduction of a ship's quarters. Not only does this development serve to enhance the training syllabus but it also paves the way for future innovation according to the “Triple Helix” concept.","PeriodicalId":159318,"journal":{"name":"2020 International Conference on 3D Immersion (IC3D)","volume":"89 1-2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131676243","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Unified Deep Learning Approach for Foveated Rendering & Novel View Synthesis from Sparse RGB-D Light Fields","authors":"Vineet Thumuluri, Mansi Sharma","doi":"10.1109/IC3D51119.2020.9376340","DOIUrl":"https://doi.org/10.1109/IC3D51119.2020.9376340","url":null,"abstract":"Near-eye light field displays provide a solution to visual discomfort when using head mounted displays by presenting accurate depth and focal cues. However, light field HMDs require rendering the scene from a large number of viewpoints. This computational challenge of rendering sharp imagery of the foveal region and reproduce retinal defocus blur that correctly drives accommodation is tackled in this paper. We designed a novel end-to-end convolutional neural network that leverages human vision to perform both foveated reconstruction and view synthesis using only 1.2% of the total light field data. The proposed architecture comprises of log-polar sampling scheme followed by an interpolation stage and a convolutional neural network. To the best of our knowledge, this is the first attempt that synthesizes the entire light field from sparse RGB-D inputs and simultaneously addresses foveation rendering for computational displays. Our algorithm achieves fidelity in the fovea without any perceptible artifacts in the peripheral regions. The performance in fovea is comparable to the state-of-the-art view synthesis methods, despite using around 10x less light field data.","PeriodicalId":159318,"journal":{"name":"2020 International Conference on 3D Immersion (IC3D)","volume":"63 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131511159","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}