{"title":"Exploring Remote Collaborative Tasks: the Impact of Avatar Representation on Dyadic Haptic Interactions in Shared Virtual Environments.","authors":"Genki Sasaki, Hiroshi Igarashi","doi":"10.1109/TVCG.2025.3580546","DOIUrl":"https://doi.org/10.1109/TVCG.2025.3580546","url":null,"abstract":"<p><p>This study is the first to explore the interplay between haptic interaction and avatar representation in Shared Virtual Environments (SVEs). Specifically, how these factors shape users' sense of social presence during dyadic collaborations, while assessing potential effects on task performance. In a series of experiments, participants performed the collaborative task with haptic interaction under four avatar representation conditions: avatars of both participant and partner were displayed, only the participant's avatar was displayed, only the partner's avatar was displayed, and no avatars were displayed. The study finds that avatar representation, especially of the partner, significantly enhances the perception of social presence, which haptic interaction alone does not fully achieve. However, neither the presence nor the type of avatar representation impacts the task performance or participants' force effort of the task, suggesting that haptic interaction provides sufficient interaction cues for the execution of the task. These results underscore the significance of integrating both visual and haptic modalities to optimize remote collaboration experiences in virtual environments, ensuring effective communication and a strong sense of social presence.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144319060","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"VOICE: Visual Oracle for Interaction, Conversation, and Explanation.","authors":"Donggang Jia, Alexandra Irger, Lonni Besancon, Ondrej Strnad, Deng Luo, Johanna Bjorklund, Alexandre Kouyoumdjian, Anders Ynnerman, Ivan Viola","doi":"10.1109/TVCG.2025.3579956","DOIUrl":"https://doi.org/10.1109/TVCG.2025.3579956","url":null,"abstract":"<p><p>We present VOICE, a novel approach to science communication that connects large language models' conversational capabilities with interactive exploratory visualization. VOICE introduces several innovative technical contributions that drive our conversational visualization framework. Based on the collected design requirements, we introduce a two-layer agent architecture that can perform task assignment, instruction extraction, and coherent content generation. We employ fine-tuning and prompt engineering techniques to tailor agents' performance to their specific roles and accurately respond to user queries. Our interactive text-to-visualization method generates a flythrough sequence matching the content explanation. In addition, natural language interaction provides capabilities to navigate and manipulate 3D models in real-time. The VOICE framework can receive arbitrary voice commands from the user and respond verbally, tightly coupled with a corresponding visual representation, with low latency and high accuracy. We demonstrate the effectiveness of our approach by implementing a proof-of-concept prototype and applying it to the molecular visualization domain: analyzing three 3D molecular models with multiscale and multi-instance attributes. Finally, we conduct a comprehensive evaluation of the system, including quantitative and qualitative analyses on our collected dataset, along with a detailed public user study and expert interviews. The results confirm that our framework and prototype effectively meet the design requirements and cater to the needs of diverse target users. All supplemental materials are available at https://osf.io/g7fbr.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144311112","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"$C^{2}D$: Context-aware Concept Decomposition for Personalized Text-to-image Synthesis.","authors":"Jiang Xin, Xiaonan Fang, Xueling Zhu, Ju Ren, Yaoxue Zhang","doi":"10.1109/TVCG.2025.3579776","DOIUrl":"https://doi.org/10.1109/TVCG.2025.3579776","url":null,"abstract":"<p><p>Concept decomposition is a technique for personalized text-to-image synthesis which learns textual embeddings of subconcepts from images that depicting an original concept. The learned subconcepts can then be composed to create new images. However, existing methods fail to address the issue of contextual conflicts when subconcepts from different sources are combined because contextual information remains encapsulated within the subconcept embeddings. To tackle this problem, we propose a Context-aware Concept Decomposition ($C^{2}D$) framework. Specifically, we introduce a Similarity-Guided Divergent Embedding (SGDE) method to obtain subconcept embeddings. Then, we eliminate the latent contextual dependence between the subconcept embeddings and reconstruct the contextual information using an independent contextual embedding. This independent context can be combined with various subconcepts, enabling more controllable text-to-image synthesis based on subconcept recombination. Extensive experimental results demonstrate that our method outperforms existing approaches in both image quality and contextual consistency.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144311111","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Visualizationary: Automating Design Feedback for Visualization Designers Using LLMs.","authors":"Sungbok Shin, Sanghyun Hong, Niklas Elmqvist","doi":"10.1109/TVCG.2025.3579700","DOIUrl":"https://doi.org/10.1109/TVCG.2025.3579700","url":null,"abstract":"<p><p>Interactive visualization editors empower users to author visualizations without writing code, but do not provide guidance on the art and craft of effective visual communication. In this paper, we explore the potential of using an off-the-shelf large language models (LLMs) to provide actionable and customized feedback to visualization designers. Our implementation, Visualizationary, demonstrates how ChatGPT can be used for this purpose through two key components: a preamble of visualization design guidelines and a suite of perceptual filters that extract salient metrics from a visualization image. We present findings from a longitudinal user study involving 13 visualization designers-6 novices, 4 intermediates, and 3 experts-who authored a new visualization from scratch over several days. Our results indicate that providing guidance in natural language via an LLM can aid even seasoned designers in refining their visualizations. All our supplemental materials are available at https://osf.io/v7hu8.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144289778","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"NeRF-CA: Dynamic Reconstruction of X-ray Coronary Angiography with Extremely Sparse-views.","authors":"Kirsten W H Maas, Danny Ruijters, Anna Vilanova, Nicola Pezzotti","doi":"10.1109/TVCG.2025.3579162","DOIUrl":"https://doi.org/10.1109/TVCG.2025.3579162","url":null,"abstract":"<p><p>Dynamic three-dimensional (4D) reconstruction from two-dimensional X-ray coronary angiography (CA) remains a significant clinical problem. Existing CA reconstruction methods often require extensive user interaction or large training datasets. Recently, Neural Radiance Field (NeRF) has successfully reconstructed high-fidelity scenes in natural and medical contexts without these requirements. However, challenges such as sparse-views, intra-scan motion, and complex vessel morphology hinder its direct application to CA data. We introduce NeRF-CA, a first step toward a fully automatic 4D CA reconstruction that achieves reconstructions from sparse coronary angiograms. To the best of our knowledge, we are the first to address the challenges of sparse-views and cardiac motion by decoupling the scene into the moving coronary artery and the static background, effectively translating the problem of motion into a strength. NeRF-CA serves as a first stepping stone for solving the 4D CA reconstruction problem, achieving adequate 4D reconstructions from as few as four angiograms, as required by clinical practice, while significantly outperforming state-of-the-art sparse-view X-ray NeRF. We validate our approach quantitatively and qualitatively using representative 4D phantom datasets and ablation studies. To accelerate research in this domain, we made our codebase public: https://github.com/kirstenmaas/NeRF-CA.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144287653","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GRAPHTRIALS: Visual Proofs of Graph Properties.","authors":"Henry Forster, Felix Klesen, Tim Dwyer, Peter Eades, Seok-Hee Hong, Stephen Kobourov, Giuseppe Liotta, Kazuo Misue, Fabrizio Montecchiani, Alexander Pastukhov, Falk Schreiber","doi":"10.1109/TVCG.2025.3577533","DOIUrl":"https://doi.org/10.1109/TVCG.2025.3577533","url":null,"abstract":"<p><p>Graph and network visualization supports exploration, analysis and communication of relational data arising in many domains: from biological and social networks, to transportation and powergrid systems. With the arrival of AI based question-answering tools, issues of trustworthiness and explainability of generated answers motivate a significant new role for visualization. In the context of graphs, we see the need for visualizations that can convince a critical audience that an assertion (e. g., from an AI) about the graph under analysis is valid. The requirements for such representations that convey precisely one specific graph property are quite different from standard network visualization criteria which optimize general aesthetics and readability. In this paper, we aim to provide a comprehensive introduction to visual proofs of graph properties and a foundation for further research in the area. We present a framework that defines what it means to visually prove a graph property. In the process, we introduce the notion of a visual certificate, that is, a specialized faithful graph visualization that leverages the viewer's perception, in particular, pre-attentive processing (e. g., via pop-out effects), verify to a given assertion about the represented graph. We also discuss the relationships between visual complexity, cognitive load and complexity theory, and propose a classification based on visual proof complexity. Then, we provide further examples of visual certificates for problems in different visual proof complexity classes. Finally, we conclude the paper with a discussion of the limitations of our model and some open problems.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144251555","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Diverse Code Query Learning for Speech-Driven Facial Animation.","authors":"Chunzhi Gu, Shigeru Kuriyama, Katsuya Hotta","doi":"10.1109/TVCG.2025.3577807","DOIUrl":"https://doi.org/10.1109/TVCG.2025.3577807","url":null,"abstract":"<p><p>Speech-driven facial animation aims to synthesize lip-synchronized 3D talking faces following the given speech signal. Prior methods to this task mostly focus on pursuing realism with deterministic systems, yet characterizing the potentially stochastic nature of facial motions has been to date rarely studied. While generative modeling approaches can easily handle the one-to-many mapping by repeatedly drawing samples, ensuring a diverse mode coverage of plausible facial motions on small-scale datasets remains challenging and less explored. In this paper, we propose predicting multiple samples conditioned on the same audio signal and then explicitly encouraging sample diversity to address diverse facial animation synthesis. Our core insight is to guide our model to explore the expressive facial latent space with a diversity-promoting loss such that the desired latent codes for diversification can be ideally identified. To this end, building upon the rich facial prior learned with vector-quantized variational auto-encoding mechanism, our model temporally queries multiple stochastic codes which can be flexibly decoded into a diverse yet plausible set of speech-faithful facial motions. To further allow for control over different facial parts during generation, the proposed model is designed to predict different facial portions of interest in a sequential manner, and compose them to eventually form full-face motions. Our paradigm realizes both diverse and controllable facial animation synthesis in a unified formulation. We experimentally demonstrate that our method yields state-of-the-art performance both quantitatively and qualitatively, especially regarding sample diversity.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144251553","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient Non-Local Point Cloud Denoising Using Curvature Entropy and $gamma$-Norm Minimization.","authors":"Jian Chen, Feng Gao, Pingping Chen, Weisi Lin","doi":"10.1109/TVCG.2025.3577915","DOIUrl":"https://doi.org/10.1109/TVCG.2025.3577915","url":null,"abstract":"<p><p>Non-local similarity (NLS) has been successfully applied to point cloud denoising. However, existing non-local methods either involve high algorithmic complexity in capturing NLS or suffer from diminished accuracy in estimating low-rank matrices. To address these problems, we propose a Point Cloud Denoising framework using $gamma$-norm minimization based on Curvature Entropy (PCD-$gamma$CE) for efficiently removing noise. First, we develop a structure descriptor, which exploits Curvature Entropy (CE) to accurately capture shape variation details of Non-Local Similar Structure (NLSS), and employs Angle Subdivision (AS) of NLSS to control the complexity of initial normal matrix construction. Second, we introduce $gamma$-norm to construct a low-rank denoising model for initial normal matrix, thereby providing a nearly unbiased estimation of rank function with better robustness to noise. Extensive experiments on synthetic and raw scanned point clouds show that our approach outperforms the popular denoising methods, with a 99.90% time reduction and gains in Mean Square Error (MSE) and Chamfer Distance (CD) compared with the Weighted Nuclear Norm Minimization (WNNM) method. The code will be available soon at https://github.com/fancj2017/PCD-rCE.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144251554","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"HaHeAE: Learning Generalisable Joint Representations of Human Hand and Head Movements in Extended Reality.","authors":"Zhiming Hu, Guanhua Zhang, Zheming Yin, Daniel Haufle, Syn Schmitt, Andreas Bulling","doi":"10.1109/TVCG.2025.3576999","DOIUrl":"https://doi.org/10.1109/TVCG.2025.3576999","url":null,"abstract":"<p><p>Human hand and head movements are the most pervasive input modalities in extended reality (XR) and are significant for a wide range of applications. However, prior works on hand and head modelling in XR only explored a single modality or focused on specific applications. We present HaHeAE - a novel self-supervised method for learning generalisable joint representations of hand and head movements in XR. At the core of our method is an autoencoder (AE) that uses a graph convolutional network-based semantic encoder and a diffusion-based stochastic encoder to learn the joint semantic and stochastic representations of hand-head movements. It also features a diffusion-based decoder to reconstruct the original signals. Through extensive evaluations on three public XR datasets, we show that our method 1) significantly outperforms commonly used self-supervised methods by up to 74.1% in terms of reconstruction quality and is generalisable across users, activities, and XR environments, 2) enables new applications, including interpretable hand-head cluster identification and variable hand-head movement generation, and 3) can serve as an effective feature extractor for downstream tasks. Together, these results demonstrate the effectiveness of our method and underline the potential of self-supervised methods for jointly modelling hand-head behaviours in extended reality.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144236288","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}