Bobby Bodenheimer, V. Popescu, J. Quarles, Lili Wang
{"title":"IEEE VR 2023 Message from the Program Chairs and Guest Editors","authors":"Bobby Bodenheimer, V. Popescu, J. Quarles, Lili Wang","doi":"10.1109/tvcg.2021.3067835","DOIUrl":"https://doi.org/10.1109/tvcg.2021.3067835","url":null,"abstract":"","PeriodicalId":13376,"journal":{"name":"IEEE Transactions on Visualization and Computer Graphics","volume":"1 1","pages":"xiv-xv"},"PeriodicalIF":5.2,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1109/tvcg.2021.3067835","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41729172","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bo Peng, Jun Hu, Jingtao Zhou, Xuan Gao, Ju-yong Zhang
{"title":"IntrinsicNGP: Intrinsic Coordinate based Hash Encoding for Human NeRF","authors":"Bo Peng, Jun Hu, Jingtao Zhou, Xuan Gao, Ju-yong Zhang","doi":"10.48550/arXiv.2302.14683","DOIUrl":"https://doi.org/10.48550/arXiv.2302.14683","url":null,"abstract":"Recently, many works have been proposed to utilize the neural radiance field for novel view synthesis of human performers. However, most of these methods require hours of training, making them difficult for practical use. To address this challenging problem, we propose IntrinsicNGP, which can train from scratch and achieve high-fidelity results in few minutes with videos of a human performer. To achieve this target, we introduce a continuous and optimizable intrinsic coordinate rather than the original explicit Euclidean coordinate in the hash encoding module of instant-NGP. With this novel intrinsic coordinate, IntrinsicNGP can aggregate inter-frame information for dynamic objects with the help of proxy geometry shapes. Moreover, the results trained with the given rough geometry shapes can be further refined with an optimizable offset field based on the intrinsic coordinate. Extensive experimental results on several datasets demonstrate the effectiveness and efficiency of IntrinsicNGP. We also illustrate our approach's ability to edit the shape of reconstructed subjects.","PeriodicalId":13376,"journal":{"name":"IEEE Transactions on Visualization and Computer Graphics","volume":" ","pages":""},"PeriodicalIF":5.2,"publicationDate":"2023-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47324735","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MoReVis: A Visual Summary for Spatiotemporal Moving Regions","authors":"Giovani Valdrighi, Nivan Ferreira, Jorge Poco","doi":"10.48550/arXiv.2302.13199","DOIUrl":"https://doi.org/10.48550/arXiv.2302.13199","url":null,"abstract":"Spatial and temporal interactions are central and fundamental in many activities in our world. A common problem faced when visualizing this type of data is how to provide an overview that helps users navigate efficiently. Traditional approaches use coordinated views or 3D metaphors like the Space-time cube to tackle this problem. However, they suffer from overplotting and often lack spatial context, hindering data exploration. More recent techniques, such as MotionRugs, propose compact temporal summaries based on 1D projection. While powerful, these techniques do not support the situation for which the spatial extent of the objects and their intersections is relevant, such as the analysis of surveillance videos or tracking weather storms. In this paper, we propose MoReVis, a visual overview of spatiotemporal data that considers the objects' spatial extent and strives to show spatial interactions among these objects by displaying spatial intersections. Like previous techniques, our method involves projecting the spatial coordinates to 1D to produce compact summaries. However, our solution's core consists of performing a layout optimization step that sets the size and positions of the visual marks on the summary to resemble the actual values on the original space. We also provide multiple interactive mechanisms to make interpreting the results more straightforward for the user. We perform an extensive experimental evaluation and usage scenarios. Moreover, we evaluated the usefulness of MoReVis in a study with 9 participants. The results point out the effectiveness and suitability of our method in representing different datasets compared to traditional techniques.","PeriodicalId":13376,"journal":{"name":"IEEE Transactions on Visualization and Computer Graphics","volume":" ","pages":""},"PeriodicalIF":5.2,"publicationDate":"2023-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45632088","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Wen-Yang Zhou, Lu Yuan, Shu-Yu Chen, Lin Gao, Shimin Hu
{"title":"LC-NeRF: Local Controllable Face Generation in Neural Randiance Field","authors":"Wen-Yang Zhou, Lu Yuan, Shu-Yu Chen, Lin Gao, Shimin Hu","doi":"10.48550/arXiv.2302.09486","DOIUrl":"https://doi.org/10.48550/arXiv.2302.09486","url":null,"abstract":"3D face generation has achieved high visual quality and 3D consistency thanks to the development of neural radiance fields (NeRF). However, these methods model the whole face as a neural radiance field, which limits the controllability of the local regions. In other words, previous methods struggle to independently control local regions, such as the mouth, nose, and hair. To improve local controllability in NeRF-based face generation, we propose LC-NeRF, which is composed of a Local Region Generators Module (LRGM) and a Spatial-Aware Fusion Module (SAFM), allowing for geometry and texture control of local facial regions. The LRGM models different facial regions as independent neural radiance fields and the SAFM is responsible for merging multiple independent neural radiance fields into a complete representation. Finally, LC-NeRF enables the modification of the latent code associated with each individual generator, thereby allowing precise control over the corresponding local region. Qualitative and quantitative evaluations show that our method provides better local controllability than state-of-the-art 3D-aware face generation methods. A perception study reveals that our method outperforms existing state-of-the-art methods in terms of image quality, face consistency, and editing effects. Furthermore, our method exhibits favorable performance in downstream tasks, including real image editing and text-driven facial image editing.","PeriodicalId":13376,"journal":{"name":"IEEE Transactions on Visualization and Computer Graphics","volume":" ","pages":""},"PeriodicalIF":5.2,"publicationDate":"2023-02-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44963218","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jing Li, Di Kang, Wenjie Pei, Xuefei Zhe, Ying Zhang, Linchao Bao, Zhenyu He
{"title":"Audio2Gestures: Generating Diverse Gestures from Audio","authors":"Jing Li, Di Kang, Wenjie Pei, Xuefei Zhe, Ying Zhang, Linchao Bao, Zhenyu He","doi":"10.48550/arXiv.2301.06690","DOIUrl":"https://doi.org/10.48550/arXiv.2301.06690","url":null,"abstract":"People may perform diverse gestures affected by various mental and physical factors when speaking the same sentences. This inherent one-to-many relationship makes co-speech gesture generation from audio particularly challenging. Conventional CNNs/RNNs assume one-to-one mapping, and thus tend to predict the average of all possible target motions, easily resulting in plain/boring motions during inference. So we propose to explicitly model the one-to-many audio-to-motion mapping by splitting the cross-modal latent code into shared code and motion-specific code. The shared code is expected to be responsible for the motion component that is more correlated to the audio while the motion-specific code is expected to capture diverse motion information that is more independent of the audio. However, splitting the latent code into two parts poses extra training difficulties. Several crucial training losses/strategies, including relaxed motion loss, bicycle constraint, and diversity loss, are designed to better train the VAE. Experiments on both 3D and 2D motion datasets verify that our method generates more realistic and diverse motions than previous state-of-the-art methods, quantitatively and qualitatively. Besides, our formulation is compatible with discrete cosine transformation (DCT) modeling and other popular backbones (i.e. RNN, Transformer). As for motion losses and quantitative motion evaluation, we find structured losses/metrics (e.g. STFT) that consider temporal and/or spatial context complement the most commonly used point-wise losses (e.g. PCK), resulting in better motion dynamics and more nuanced motion details. Finally, we demonstrate that our method can be readily used to generate motion sequences with user-specified motion clips on the timeline.","PeriodicalId":13376,"journal":{"name":"IEEE Transactions on Visualization and Computer Graphics","volume":" ","pages":""},"PeriodicalIF":5.2,"publicationDate":"2023-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42570159","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"NeRF-Art: Text-Driven Neural Radiance Fields Stylization","authors":"Can Wang, Ruixia Jiang, Menglei Chai, Mingming He, Dongdong Chen, Jing Liao","doi":"10.48550/arXiv.2212.08070","DOIUrl":"https://doi.org/10.48550/arXiv.2212.08070","url":null,"abstract":"As a powerful representation of 3D scenes, the neural radiance field (NeRF) enables high-quality novel view synthesis from multi-view images. Stylizing NeRF, however, remains challenging, especially in simulating a text-guided style with both the appearance and the geometry altered simultaneously. In this paper, we present NeRF-Art, a text-guided NeRF stylization approach that manipulates the style of a pre-trained NeRF model with a simple text prompt. Unlike previous approaches that either lack sufficient geometry deformations and texture details or require meshes to guide the stylization, our method can shift a 3D scene to the target style characterized by desired geometry and appearance variations without any mesh guidance. This is achieved by introducing a novel global-local contrastive learning strategy, combined with the directional constraint to simultaneously control both the trajectory and the strength of the target style. Moreover, we adopt a weight regularization method to effectively suppress cloudy artifacts and geometry noises which arise easily when the density field is transformed during geometry stylization. Through extensive experiments on various styles, we demonstrate that our method is effective and robust regarding both single-view stylization quality and cross-view consistency. The code and more results can be found on our project page: https://cassiepython.github.io/nerfart/.","PeriodicalId":13376,"journal":{"name":"IEEE Transactions on Visualization and Computer Graphics","volume":" ","pages":""},"PeriodicalIF":5.2,"publicationDate":"2022-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44448914","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yu-An Liu, Chengjiang Long, Zhaoxuan Zhang, Bo Liu, Qiang Zhang, Baocai Yin, Xin Yang
{"title":"Explore Contextual Information for 3D Scene Graph Generation","authors":"Yu-An Liu, Chengjiang Long, Zhaoxuan Zhang, Bo Liu, Qiang Zhang, Baocai Yin, Xin Yang","doi":"10.48550/arXiv.2210.06240","DOIUrl":"https://doi.org/10.48550/arXiv.2210.06240","url":null,"abstract":"3D scene graph generation (SGG) has been of high interest in computer vision. Although the accuracy of 3D SGG on coarse classification and single relation label has been gradually improved, the performance of existing works is still far from being perfect for fine-grained and multi-label situations. In this paper, we propose a framework fully exploring contextual information for the 3D SGG task, which attempts to satisfy the requirements of fine-grained entity class, multiple relation labels, and high accuracy simultaneously. Our proposed approach is composed of a Graph Feature Extraction module and a Graph Contextual Reasoning module, achieving appropriate information-redundancy feature extraction, structured organization, and hierarchical inferring. Our approach achieves superior or competitive performance over previous methods on the 3DSSG dataset, especially on the relationship prediction sub-task.","PeriodicalId":13376,"journal":{"name":"IEEE Transactions on Visualization and Computer Graphics","volume":" ","pages":""},"PeriodicalIF":5.2,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44168847","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-User Redirected Walking in Separate Physical Spaces for Online VR Scenarios","authors":"Sen-Zhe Xu, Jia-Hong Liu, Miao Wang, Fang-Lue Zhang, Songhai Zhang","doi":"10.48550/arXiv.2210.05356","DOIUrl":"https://doi.org/10.48550/arXiv.2210.05356","url":null,"abstract":"With the recent rise of Metaverse, online multiplayer VR applications are becoming increasingly prevalent worldwide. However, as multiple users are located in different physical environments, different reset frequencies and timings can lead to serious fairness issues for online collaborative/competitive VR applications. For the fairness of online VR apps/games, an ideal online RDW strategy must make the locomotion opportunities of different users equal, regardless of different physical environment layouts. The existing RDW methods lack the scheme to coordinate multiple users in different PEs, and thus have the issue of triggering too many resets for all the users under the locomotion fairness constraint. We propose a novel multi-user RDW method that is able to significantly reduce the overall reset number and give users a better immersive experience by providing a fair exploration. Our key idea is to first find out the \"bottleneck\" user that may cause all users to be reset and estimate the time to reset given the users' next targets, and then redirect all the users to favorable poses during that maximized bottleneck time to ensure the subsequent resets can be postponed as much as possible. More particularly, we develop methods to estimate the time of possibly encountering obstacles and the reachable area for a specific pose to enable the prediction of the next reset caused by any user. Our experiments and user study found that our method outperforms existing RDW methods in online VR applications.","PeriodicalId":13376,"journal":{"name":"IEEE Transactions on Visualization and Computer Graphics","volume":" ","pages":""},"PeriodicalIF":5.2,"publicationDate":"2022-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46927440","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"TraInterSim: Adaptive and Planning-Aware Hybrid-Driven Traffic Intersection Simulation","authors":"Pei Lv, Xinming Pei, Xinyu Ren, Yuzhen Zhang, Chaochao Li, Mingliang Xu","doi":"10.48550/arXiv.2210.08118","DOIUrl":"https://doi.org/10.48550/arXiv.2210.08118","url":null,"abstract":"Traffic intersections are important scenes that can be seen almost everywhere in the traffic system. Currently, most simulation methods perform well at highways and urban traffic networks. In intersection scenarios, the challenge lies in the lack of clearly defined lanes, where agents with various motion plannings converge in the central area from different directions. Traditional model-based methods are difficult to drive agents to move realistically at intersections without enough predefined lanes, while data-driven methods often require a large amount of high-quality input data. Simultaneously, tedious parameter tuning is inevitable involved to obtain the desired simulation results. In this paper, we present a novel adaptive and planning-aware hybrid-driven method (TraInterSim) to simulate traffic intersection scenarios. Our hybrid-driven method combines an optimization-based data-driven scheme with a velocity continuity model. It guides the agent's movements using real-world data and can generate those behaviors not present in the input data. Our optimization method fully considers velocity continuity, desired speed, direction guidance, and planning-aware collision avoidance. Agents can perceive others' motion plannings and relative distances to avoid possible collisions. To preserve the individual flexibility of different agents, the parameters in our method are automatically adjusted during the simulation. TraInterSim can generate realistic behaviors of heterogeneous agents in different traffic intersection scenarios in interactive rates. Through extensive experiments as well as user studies, we validate the effectiveness and rationality of the proposed simulation method.","PeriodicalId":13376,"journal":{"name":"IEEE Transactions on Visualization and Computer Graphics","volume":" ","pages":""},"PeriodicalIF":5.2,"publicationDate":"2022-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47313608","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Huijie Guo, Meijun Liu, Bowen Yang, Ye Sun, Huamin Qu, Lei Shi
{"title":"RankFIRST: Visual Analysis for Factor Investment By Ranking Stock Timeseries.","authors":"Huijie Guo, Meijun Liu, Bowen Yang, Ye Sun, Huamin Qu, Lei Shi","doi":"10.1109/TVCG.2022.3209414","DOIUrl":"10.1109/TVCG.2022.3209414","url":null,"abstract":"<p><p>In the era of quantitative investment, factor-based investing models are widely adopted in the construction of stock portfolios. These models explain the performance of individual stocks by a set of financial factors, e.g., market beta and company size. In industry, open investment platforms allow the online building of factor-based models, yet set a high bar on the engineering expertise of end-users. State-of-the-art visualization systems integrate the whole factor investing pipeline, but do not directly address domain users' core requests on ranking factors and stocks for portfolio construction. The current model lacks explainability, which downgrades its credibility with stock investors. To fill the gap in modeling, ranking, and visualizing stock time series for factor investment, we designed and implemented a visual analytics system, namely RankFIRST. The system offers built-in support for an established factor collection and a cross-sectional regression model viable for human interpretation. A hierarchical slope graph design is introduced according to the desired characteristics of good factors for stock investment. A novel firework chart is also invented extending the well-known candlestick chart for stock time series. We evaluated the system on the full-scale Chinese stock market data in the recent 30 years. Case studies and controlled user evaluation demonstrate the superiority of our system on factor investing, in comparison to both passive investing on stock indices and existing stock market visual analytics tools.</p>","PeriodicalId":13376,"journal":{"name":"IEEE Transactions on Visualization and Computer Graphics","volume":"PP ","pages":""},"PeriodicalIF":5.2,"publicationDate":"2022-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9509274","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}