{"title":"MM-NeRF: Multimodal-Guided 3D Multi-Style Transfer of Neural Radiance Field.","authors":"Zijiang Yang, Zhongwei Qiu, Chang Xu, Dongmei Fu","doi":"10.1109/TVCG.2024.3476331","DOIUrl":"https://doi.org/10.1109/TVCG.2024.3476331","url":null,"abstract":"<p><p>3D style transfer aims to generate stylized views of 3D scenes with specified styles, which requires high-quality generation and multi-view consistency. Existing methods still struggle with high-quality stylization with texture details and stylization with multimodal guidance. In this paper, we reveal that the common training method of stylization with NeRF, which generates stylized multi-view supervision by 2D style transfer models, causes the same object in supervision to show various states (color tone, details, etc.) in different views, leading NeRF to tend to smooth the texture details, further resulting in low-quality rendering for 3D multi-style transfer. To tackle these problems, we propose a novel Multimodal-guided 3D Multi-style transfer of NeRF, termed MM-NeRF. First, MM-NeRF projects multimodal guidance into a unified space to keep the multimodal styles consistent and extracts multimodal features to guide the 3D stylization. Second, a novel multi-head learning scheme is proposed to relieve the difficulty of learning multi-style transfer, and a multi-view style consistent loss is proposed to track the inconsistency of multi-view supervision data. Finally, a novel incremental learning mechanism is proposed to generalize MM-NeRF to any new style with small costs. 
Extensive experiments on several real-world datasets show that MM-NeRF achieves high-quality 3D multi-style stylization with multimodal guidance, and keeps multi-view consistency and style consistency between multimodal guidance.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142396311","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learn2Talk: 3D Talking Face Learns from 2D Talking Face.","authors":"Yixiang Zhuang, Baoping Cheng, Yao Cheng, Yuntao Jin, Renshuai Liu, Chengyang Li, Xuan Cheng, Jing Liao, Juncong Lin","doi":"10.1109/TVCG.2024.3476275","DOIUrl":"https://doi.org/10.1109/TVCG.2024.3476275","url":null,"abstract":"<p><p>The speech-driven facial animation technology is generally categorized into two main types: 3D and 2D talking face. Both of these have garnered considerable research attention in recent years. However, to our knowledge, the research into 3D talking face has not progressed as deeply as that of 2D talking face, particularly in terms of lip-sync and perceptual mouth movements. The lip-sync necessitates an impeccable synchronization between mouth motion and speech audio. The speech perception derived from the perceptual mouth movements should resemble that of the driving audio. To mind the gap between the two sub-fields, we propose Learn2Talk, a learning framework that enhances 3D talking face network by integrating two key insights from the field of 2D talking face. Firstly, drawing inspiration from the audio-video sync network, we develop a 3D sync-lip expert model for the pursuit of lip-sync between audio and 3D facial motions. Secondly, we utilize a teacher model, carefully chosen from among 2D talking face methods, to guide the training of the audio-to-3D motions regression network, thereby increasing the accuracy of 3D vertex movements. Extensive experiments demonstrate the superiority of our proposed framework over state-of-the-art methods in terms of lip-sync, vertex accuracy and perceptual movements. Finally, we showcase two applications of our framework: audio-visual speech recognition and speech-driven 3D Gaussian Splatting-based avatar animation. 
The project page of this paper is: https://lkjkjoiuiu.github.io/Learn2Talk/.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142396310","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parametric Body Reconstruction Based on a Single Front Scan Point Cloud.","authors":"Xihang Li, Guiqin Li, Ming Li, Haoju Song","doi":"10.1109/TVCG.2024.3475414","DOIUrl":"10.1109/TVCG.2024.3475414","url":null,"abstract":"<p><p>Full-body 3D scanning simplifies the acquisition of digital body models. However, current systems are bulky, intricate, and costly, with strict clothing constraints. We propose a pipeline that combines inner body shape inference and parametric model registration for reconstructing the corresponding body model from a single front scan of a clothed body. Three network modules (Scan2Front-Net, Front2Back-Net, and Inner2Corr-Net) with relatively independent functions are proposed for predicting front inner, back inner, and parametric model reference point clouds, respectively. We consider the back inner point cloud as an axial offset of the front inner point cloud and divide the body into 14 parts. This offset relationship is then learned within the same body parts to reduce the ambiguity of the inference. The predicted front and back inner point clouds are concatenated as the inner body point cloud, and then reconstruction is achieved by registering the parametric body model through a point-to-point correspondence between the reference point cloud and the inner body point cloud. 
Qualitative and quantitative analyses show that the proposed method has significant advantages in terms of body shape completion and body model reconstruction accuracy.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142396312","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"RobustMap: Visual Exploration of DNN Adversarial Robustness in Generative Latent Space.","authors":"Jie Li, Jielong Kuang","doi":"10.1109/TVCG.2024.3471551","DOIUrl":"https://doi.org/10.1109/TVCG.2024.3471551","url":null,"abstract":"<p><p>The paper presents a novel approach to visualizing adversarial robustness (called robustness below) of deep neural networks (DNNs). Traditional tests only return a value reflecting a DNN's overall robustness across a fixed number of test samples. Unlike them, we use test samples to train a generative model (GM) and render a DNN's robustness distribution over infinite generated samples within the GM's latent space. The approach extends test samples, enabling users to obtain new test samples to improve feature coverage constantly. Moreover, the distribution provides more information about a DNN's robustness, enabling users to understand a DNN's robustness comprehensively. We propose three methods to resolve the challenges of realizing the approach. Specifically, we (1) map a GM's high-dimensional latent space onto a plane with less information loss for visualization, (2) design a network to predict a DNN's robustness on massive samples to speed up the distribution rendering, and (3) develop a system that supports users in exploring the distribution from multiple perspectives. 
Subjective and objective experiment results prove the usability and effectiveness of the approach.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142373928","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Smart Pipette: Elevating Laboratory Performance with Tactile Authenticity and Real-Time Feedback.","authors":"Juan M Pieschacon, Maurizio Costabile, Andrew Cunningham, Joanne Zucco, Stewart Von Itzstein, Ross T Smith","doi":"10.1109/TVCG.2024.3472837","DOIUrl":"10.1109/TVCG.2024.3472837","url":null,"abstract":"<p><p>Mastering the correct use of laboratory equipment is a fundamental skill for undergraduate science students involved in laboratory-based training. However, hands-on laboratory time is often limited, and remote students may struggle as their absence from the physical lab limits their skill development. An air-displacement micropipette was selected for our initial investigation, as accuracy and correct technique are essential in generating reliable assay data. Handling small liquid volumes demands hand dexterity and practice to achieve proficiency. This research assesses the importance of tactile authenticity during training by faithfully replicating the micropipette's key physical and operational characteristics. We developed a custom haptic training approach called 'Smart Pipette' which promotes accurate operation and enhances laboratory dexterity training. A comparative user study with 34 participants evaluated the effectiveness of the Smart Pipette custom haptic device against training with off-the-shelf hardware, specifically the Quest VR hand controller, which was chosen because it is held mid-air similar to a laboratory micropipette. Both training conditions are integrated with the same self-paced virtual simulation displayed on a computer screen, offering clear video instructions and real-time guidance. Results demonstrated that participants trained with the Smart Pipette custom haptic device exhibited increased accuracy and precision while making fewer errors than those trained with off-the-shelf hardware. The Smart Pipette and the Quest VR controller had no significant differences in cognitive load and system usability scores. 
Tactile authentic interaction devices address challenges faced by online learners, while their applicability extends to traditional classrooms, where real-time feedback significantly enhances overall training performance outcomes.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142367962","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Client-Designer Negotiation in Data Visualization Projects.","authors":"Elsie Lee-Robbins, Arran Ridley, Eytan Adar","doi":"10.1109/TVCG.2024.3467189","DOIUrl":"https://doi.org/10.1109/TVCG.2024.3467189","url":null,"abstract":"<p><p>Data visualization designers and clients need to communicate effectively with each other to achieve a successful project. Unlike a personal or solo project, working with a client introduces a layer of complexity to the process. Client and designer might have different ideas about what is an acceptable solution that would satisfy the goals and constraints of the project. Thus, the client-designer relationship is an important part of the design process. To better understand the relationship, we conducted an interview study with 12 data visualization designers. We develop a model of a client-designer project space consisting of three aspects: surfacing project goals, agreeing on resource allocation, and creating a successful design. For each aspect, designer and client have their own mental model of how they envision the project. Disagreements between these models can be resolved by negotiation that brings them closer to alignment. We identified three main negotiation strategies to navigate the project space: 1) expanding the project space to consider more potential options, 2) constraining the project space to narrow in on the boundaries, and 3) shifting the project space to different options. We discuss client-designer collaboration as a negotiated relationship, with opportunities and challenges for each side. 
We suggest ways to mitigate challenges to avoid friction from developing into conflict.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142367961","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"VMC: A Grammar for Visualizing Statistical Model Checks.","authors":"Ziyang Guo, Alex Kale, Matthew Kay, Jessica Hullman","doi":"10.1109/TVCG.2024.3456402","DOIUrl":"10.1109/TVCG.2024.3456402","url":null,"abstract":"<p><p>Visualizations play a critical role in validating and improving statistical models. However, the design space of model check visualizations is not well understood, making it difficult for authors to explore and specify effective graphical model checks. VMC defines a model check visualization using four components: (1) samples of distributions of checkable quantities generated from the model, including predictive distributions for new data and distributions of model parameters; (2) transformations on observed data to facilitate comparison; (3) visual representations of distributions; and (4) layouts to facilitate comparing model samples and observed data. We contribute an implementation of VMC as an R package. We validate VMC by reproducing a set of canonical model check examples, and show how using VMC to generate model checks reduces the edit distance between visualizations relative to existing visualization toolkits. The findings of an interview study with three expert modelers who used VMC highlight challenges and opportunities for encouraging exploration of correct, effective model check visualizations.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142335191","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Large Language Models for Transforming Categorical Data to Interpretable Feature Vectors.","authors":"Karim Huesmann, Lars Linsen","doi":"10.1109/TVCG.2024.3460652","DOIUrl":"https://doi.org/10.1109/TVCG.2024.3460652","url":null,"abstract":"<p><p>When analyzing heterogeneous data comprising numerical and categorical attributes, it is common to treat the different data types separately or transform the categorical attributes to numerical ones. The transformation has the advantage of facilitating an integrated multi-variate analysis of all attributes. We propose a novel technique for transforming categorical data into interpretable numerical feature vectors using Large Language Models (LLMs). The LLMs are used to identify the categorical attributes' main characteristics and assign numerical values to these characteristics, thus generating a multi-dimensional feature vector. The transformation can be computed fully automatically, but due to the interpretability of the characteristics, it can also be adjusted intuitively by an end user. We provide a respective interactive tool that aims to validate and possibly improve the AI-generated outputs. Having transformed a categorical attribute, we propose novel methods for ordering and color-coding the categories based on the similarities of the feature vectors.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142335187","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tooth Motion Monitoring in Orthodontic Treatment by Mobile Device-based Multi-view Stereo.","authors":"Jiaming Xie, Congyi Zhang, Guangshun Wei, Peng Wang, Guodong Wei, Wenxi Liu, Min Gu, Ping Luo, Wenping Wang","doi":"10.1109/TVCG.2024.3470992","DOIUrl":"https://doi.org/10.1109/TVCG.2024.3470992","url":null,"abstract":"<p><p>Nowadays, orthodontics has become an important part of modern personal life to assist one in improving mastication and raising self-esteem. However, the quality of orthodontic treatment still heavily relies on the empirical evaluation of experienced doctors, which lacks quantitative assessment and requires patients to visit clinics frequently for in-person examination. To resolve the aforementioned problem, we propose a novel and practical mobile device-based framework for precisely measuring tooth movement in treatment, so as to simplify and strengthen the traditional tooth monitoring process. To this end, we formulate the tooth movement monitoring task as a multi-view multi-object pose estimation problem via different views that capture multiple texture-less and severely occluded objects (i.e. teeth). Specifically, we exploit a pre-scanned 3D tooth model and a sparse set of multi-view tooth images as inputs for our proposed tooth monitoring framework. After extracting tooth contours and localizing the initial camera pose of each view from the initial configuration, we propose a joint pose estimation scheme to precisely estimate the 3D pose of each individual tooth, so as to infer their relative offsets during treatment. Furthermore, we introduce the metric of Relative Pose Bias to evaluate the individual tooth pose accuracy in a small scale. 
We demonstrate that our approach is capable of reaching high accuracy and efficiency as practical orthodontic treatment monitoring requires.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142335190","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"HUMAP: Hierarchical Uniform Manifold Approximation and Projection.","authors":"Wilson E Marcilio-Jr, Danilo M Eler, Fernando V Paulovich, Rafael M Martins","doi":"10.1109/TVCG.2024.3471181","DOIUrl":"10.1109/TVCG.2024.3471181","url":null,"abstract":"<p><p>Dimensionality reduction (DR) techniques help analysts to understand patterns in high-dimensional spaces. These techniques, often represented by scatter plots, are employed in diverse science domains and facilitate similarity analysis among clusters and data samples. For datasets containing many granularities or when analysis follows the information visualization mantra, hierarchical DR techniques are the most suitable approach since they present major structures beforehand and details on demand. This work presents HUMAP, a novel hierarchical dimensionality reduction technique designed to be flexible on preserving local and global structures and preserve the mental map throughout hierarchical exploration. We provide empirical evidence of our technique's superiority compared with current hierarchical approaches and show a case study applying HUMAP for dataset labelling.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142335185","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}