Cloud-based collaborative 3D reconstruction using smartphones
F. Poiesi, Alex Locher, P. Chippendale, E. Nocerino, Fabio Remondino, L. Gool
Proceedings of the 14th European Conference on Visual Media Production (CVMP 2017), 2017. DOI: 10.1145/3150165.3150166
Abstract: This article presents a pipeline that enables multiple users to collaboratively acquire images with monocular smartphones and derive a 3D point cloud using a remote reconstruction server. A set of key images is automatically selected from each smartphone's camera video feed as multiple users record different viewpoints of an object, concurrently or at different time instants. Selected images are automatically processed and registered with an incremental Structure from Motion (SfM) algorithm in order to create a 3D model. Our incremental SfM approach enables on-the-fly feedback to be generated for the user about the current reconstruction progress. Feedback is provided in the form of a preview window showing the current 3D point cloud, enabling users to see whether parts of a surveyed scene need further attention or coverage while they are still in situ. We evaluate our 3D reconstruction pipeline by performing experiments in uncontrolled and unconstrained real-world scenarios. Datasets are publicly available.
{"title":"User Interaction for Image Recolouring using £2","authors":"M. Grogan, Rozenn Dahyot, A. Smolic","doi":"10.1145/3150165.3150171","DOIUrl":"https://doi.org/10.1145/3150165.3150171","url":null,"abstract":"Recently, an example based colour transfer approach proposed modelling the colour distributions of a palette and target image using Gaussian Mixture Models, and registers them by minimising the robust £2 distance between the mixtures. In this paper we propose to extend this approach to allow for user interaction. We present two interactive recolouring applications, the first allowing the user to select colour correspondences between a target and palette image, while the second palette based application allows the user to edit a palette of colours to determine the image recolouring. We modify the £2 based cost function to improve results when an interactive interface is used, and take measures to ensure that even when minimal input is given by the user, good colour transfer results are created. Both applications are available through a web interface and qualitatively assessed against recent recolouring techniques.","PeriodicalId":412591,"journal":{"name":"Proceedings of the 14th European Conference on Visual Media Production (CVMP 2017)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127776100","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CRF-net: Single Image Radiometric Calibration using CNNs","authors":"Han Li, P. Peers","doi":"10.1145/3150165.3150170","DOIUrl":"https://doi.org/10.1145/3150165.3150170","url":null,"abstract":"In this paper we present CRF-net, a CNN-based solution for estimating the camera response function from a single photograph. We follow the recent trend of using synthetic training data, and generate a large set of training pairs based on a small set of radio-metrically linear images and the DoRF database of camera response functions. The resulting CRF-net estimates the parameters of the EMoR camera response model directly from a single photograph. Experimentally, we show that CRF-net is able to accurately recover the camera response function from a single photograph under a wide range of conditions.","PeriodicalId":412591,"journal":{"name":"Proceedings of the 14th European Conference on Visual Media Production (CVMP 2017)","volume":"127 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125771451","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Real-time rendering of realistic surface diffraction with low rank factorisation","authors":"Antoine Toisoul, A. Ghosh","doi":"10.1145/3150165.3150167","DOIUrl":"https://doi.org/10.1145/3150165.3150167","url":null,"abstract":"We propose a novel approach for real-time rendering of diffraction effects in surface reflectance in arbitrary environments. Such renderings are usually extremely expensive as they require the computation of a convolution at real-time framerates. In the case of diffraction, the diffraction lobes usually have high frequency details that can only be captured with high resolution convolution kernels which make calculations even more expensive. Our method uses a low rank factorisation of the diffraction lookup table to approximate a 2D convolution kernel by two simpler low rank kernels which allow the computation of the convolution at real-time framerates using two rendering passes. We show realistic renderings in arbitrary environments and achieve a performance from 50 to 100 FPS making possible to use such a technique in real-time applications such as video games and VR.","PeriodicalId":412591,"journal":{"name":"Proceedings of the 14th European Conference on Visual Media Production (CVMP 2017)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130727357","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ThirdLight: Low-cost and High-speed 3D Interaction Using Photosensor Markers
Jaewon Kim, Gyuchull Han, Hwasup Lim, S. Izadi, A. Ghosh
Proceedings of the 14th European Conference on Visual Media Production (CVMP 2017), 2017. DOI: 10.1145/3150165.3150169
Abstract: We present a low-cost 3D tracking system for virtual reality, gesture modeling, and robot manipulation applications which require fast and precise localization of headsets, data gloves, props, or controllers. Our system removes the need for cameras or projectors for sensing, and instead uses cheap LEDs and printed masks for illumination, together with low-cost photosensitive markers. The illumination device transmits a spatiotemporal pattern as a series of binary Gray-code patterns. Multiple illumination devices can be combined to localize each marker in 3D at high speed (333 Hz). Our method has strengths in accuracy, speed, cost, ambient performance, large working space (1 m to 5 m) and robustness to noise compared with conventional techniques. We compare with a state-of-the-art instrumented glove and vision-based systems to demonstrate the accuracy, scalability, and robustness of our approach. We propose a fast and accurate method for hand gesture modeling using an inverse kinematics approach with the six photosensitive markers. We additionally propose a passive marker system and demonstrate various interaction scenarios as practical applications.
Laplacian Pyramid of Conditional Variational Autoencoders
Garoe Dorta, S. Vicente, L. Agapito, N. Campbell, S. Prince, Ivor J. A. Simpson
Proceedings of the 14th European Conference on Visual Media Production (CVMP 2017), 2017. DOI: 10.1145/3150165.3150172
Abstract: Variational Autoencoders (VAEs) learn a latent representation of image data that allows natural image generation and manipulation. However, they struggle to generate sharp images. To address this problem, we propose a hierarchy of VAEs analogous to a Laplacian pyramid. Each network models a single pyramid level, and is conditioned on the coarser levels. The Laplacian architecture allows for novel image editing applications that take advantage of the coarse-to-fine structure of the model. Our method achieves lower reconstruction error in terms of MSE, which is the loss function of the VAE and is not directly minimised in our model. Furthermore, the reconstructions generated by the proposed model are preferred over those from the VAE by human evaluators.
{"title":"Saliency-Based Sharpness Mismatch Detection For Stereoscopic Omnidirectional Images","authors":"S. Croci, S. Knorr, A. Smolic","doi":"10.1145/3150165.3150168","DOIUrl":"https://doi.org/10.1145/3150165.3150168","url":null,"abstract":"In this paper, we present a novel sharpness mismatch detection (SMD) approach for stereoscopic omnidirectional images (ODI) for quality control within the post-production workflow, which is the main contribution. In particular, we applied a state of the art SMD approach, which was originally developed for traditional HD images, and extended it to stereoscopic ODIs. A new efficient method for patch extraction from ODIs was developed based on the spherical Voronoi diagram of evenly distributed points on the sphere. The subdivision of the ODI into patches allows an accurate detection and localization of regions with sharpness mismatch. A second contribution of the paper is the integration of saliency into our SMD approach. In this context, we introduce a novel method for the estimation of saliency maps from viewport data of head-mounted displays (HMD). Finally, we demonstrate the performance of our SMD approach with data collected from a subjective test with 17 participants.","PeriodicalId":412591,"journal":{"name":"Proceedings of the 14th European Conference on Visual Media Production (CVMP 2017)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125430114","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Story Version Control and Graphical Visualization for Collaborative Story Authoring
Fabio Zünd, Steven Poulakos, Mubbasir Kapadia, R. Sumner
Proceedings of the 14th European Conference on Visual Media Production (CVMP 2017), 2017. DOI: 10.1145/3150165.3150175
Abstract: This paper presents a story version control and graphical visualization framework to enhance collaborative story authoring. We propose a media-agnostic story representation based on story beats, events, and participants that describes the flow of events in a storyline. We develop tree edit distance operations for this representation and use them to build the core features for story version control, including visual diff, conflict detection, and conflict resolution using three-way merge. Our system allows authors to work independently on the same story while providing the ability to automatically synchronize their efforts and resolve conflicts that may arise. We further enhance the collaborative authoring process using visualizations derived from the version control database that visually encode relationships between authors, characters, and story elements during the evolution of the narrative. We demonstrate the efficacy of our system by integrating it within an existing visual storyboarding tool for authoring animated stories, and additionally use it to collaboratively author stories using video and images. We evaluate the usability of our system through two user studies. Our results reveal that untrained users are able to use and benefit from our system. Additionally, users are able to correctly interpret the graphical visualizations and perceive them to benefit collaboration during the story authoring process.
{"title":"Seam-hiding for Looping Videos","authors":"James Durrant, G. Brostow","doi":"10.1145/3150165.3152766","DOIUrl":"https://doi.org/10.1145/3150165.3152766","url":null,"abstract":"The proposed algorithm creates a seamless looping video clip from a real world video of an almost-cyclic motion. For a video that has repeating motion, such as a person on a trampoline, the first and last video frames may not precisely line up, even though the content is very similar. Playing back the video in a looping fashion can cause the re-start transition to jump out and appear discontinuous, both spatially and in terms of object velocity. Most work on video looping has sought to find the best re-set point in a longer video's timeline, but we start there, and modify the frames to hide the jump point. Our approach essentially fits a curve to the (x, y) and RGB coordinates of points in the scene, and then smooths those curves using gradient domain optimisation. We address important qualitative factors, balancing smoothness against preservation of the original trajectories/curves. Our modular system also incorporates video stabilisation and inpainting, to cope with more dynamic videos. For most videos within our scope, we found that automatic seam-hiding is succesful. For the cases in which the proposed system cannot satisfactorily produce a seamless loop, we hope our framework can be modified with improved components to achieve better results in the future.","PeriodicalId":412591,"journal":{"name":"Proceedings of the 14th European Conference on Visual Media Production (CVMP 2017)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133252125","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Method for Efficient CPU-GPU Streaming for Walkthrough of Full Motion Lightfield Video","authors":"Floyd M. Chitalu, Babis Koniaris, Kenny Mitchell","doi":"10.1145/3150165.3150173","DOIUrl":"https://doi.org/10.1145/3150165.3150173","url":null,"abstract":"Lightfield video, as a high-dimensional function, is very demanding in terms of storage. As such, lightfield video data, even in a compressed form, do not typically fit in GPU or main memory unless the capture area, resolution or duration is sufficiently small. Additionally, latency minimization--critical for viewer comfort in use-cases such as virtual reality--places further constraints in many compression schemes. In this paper, we propose a scalable method for streaming lightfield video, parameterized on viewer location and time, that efficiently handles RAM-to-GPU memory transfers of lightfield video in a compressed form, utilizing the GPU architecture for reduction of latency. We demonstrate the effectiveness of our method in a variety of compressed animated lightfield datasets.","PeriodicalId":412591,"journal":{"name":"Proceedings of the 14th European Conference on Visual Media Production (CVMP 2017)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125399564","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}