{"title":"Real-time Translation of Upper-body Gestures to Virtual Avatars in Dissimilar Telepresence Environments.","authors":"Jiho Kang, Taehei Kim, Hyeshim Kim, Sung-Hee Lee","doi":"10.1109/TVCG.2025.3577156","DOIUrl":"https://doi.org/10.1109/TVCG.2025.3577156","url":null,"abstract":"<p><p>In mixed reality (MR) avatar-mediated telepresence, avatar movement must be adjusted to convey the user's intent in a dissimilar space. This paper presents a novel neural network-based framework designed for translating upper-body gestures, which adjusts virtual avatar movements in dissimilar environments to accurately reflect the user's intended gestures in real-time. Our framework translates a wide range of upperbody gestures, including eye gaze, deictic gestures, free-form gestures, and the transitions between them. A key feature of our framework is its ability to generate natural upper-body gestures for users of different sizes, irrespective of handedness and eye dominance, even though the training is based on data from a single person. Unlike previous methods that require paired motion between users and avatars for training, our framework uses an unpaired approach, significantly reducing training time and allowing for generating a wider variety of motion types. These advantages were made possible by designing two separate networks: the Motion Progression Network, which interprets sparse tracking signals from the user to determine motion progression, and the Upper-body Gesture Network, which autoregressively generates the avatar's pose based on these progressions. We demonstrate the effectiveness of our framework through quantitative comparisons with state-of-the-art methods, qualitative animation results, and a user evaluation in MR telepresence scenarios.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144236289","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"At the Peak: Empirical Patterns for Creating Climaxes in Data Videos.","authors":"Zheng Wei, Yuelu Li, Wenchuan Lu, Qiming Gu, Huamin Qu, Xian Xu","doi":"10.1109/TVCG.2025.3576597","DOIUrl":"https://doi.org/10.1109/TVCG.2025.3576597","url":null,"abstract":"<p><p>We identify and evaluate 40 recurring design patterns for crafting emotionally resonant climaxes in data videos, advancing data-driven storytelling research. Despite the growing popularity of data videos, guidance on designing narrative climaxes that maximise viewers' emotional engagement remains scarce. To address this gap, our work leverages emotional theory to derive patterns for crafting emotionally resonant climaxes of data videos. We first analyzed the climaxes of 96 data videos, categorizing them into eight emotional dimensions based on Plutchik's basic emotion model. Based on data analysis, we then formulated 40 patterns for creating narrative climaxes. To evaluate the patterns when applied as design hints, we conducted a user study with 48 participants, where Group A created data video climaxes using our patterns, Group B created them without our patterns, and Group C used other patterns as the baseline. Evaluations by two experts and 20 general audiences revealed that the climaxes created with the patterns were more emotionally engaging. The participants also praised the clarity and practicality of the patterns.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144228063","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Self-supervised Learning of Event-guided Video Frame Interpolation for Rolling Shutter Frames.","authors":"Yunfan Lu, Guoqiang Liang, Yiran Shen, Lin Wang","doi":"10.1109/TVCG.2025.3576305","DOIUrl":"https://doi.org/10.1109/TVCG.2025.3576305","url":null,"abstract":"<p><p>Most consumer cameras use rolling shutter (RS) exposure, the captured videos often suffer from distortions (e.g., skew and jelly effect). Also, these videos are impeded by the limited bandwidth and frame rate, which inevitably affect the video streaming experience. In this paper, we excavate the potential of event cameras as they enjoy high temporal resolution. Accordingly, we propose a framework to recover the global shutter (GS) high frame rate (i.e., slow motion) video without RS distortion from an RS camera and event camera. One challenge is the lack of real-world datasets for supervised training. Therefore, we explore self-supervised learning with the key idea of estimating the displacement field-a non-linear and dense 3D spatiotemporal representation of all pixels during the exposure time. This allows for a mutual reconstruction between RS and GS frames and facilitates slow-motion video recovery. We then combine the input RS frames with the DF to map them to the GS frames (RS-to-GS). Given the under-constrained nature of this mapping, we integrate it with the inverse mapping (GS-to-RS) and RS frame warping (RS-to-RS) for self-supervision. We evaluate our framework via objective analysis (i.e., quantitative and qualitative comparisons on four datasets) and subjective studies (i.e., user study). The results show that our framework can recover slow-motion videos without distortion, with much lower bandwidth ($94%$ drop) and higher inference speed ($16ms/frame$) under $32 times$ frame interpolation. The dataset and source code are publicly available at: https://github.com/yunfanLu/Self-EvRSVFI.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-06-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144217942","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"JailbreakLens: Visual Analysis of Jailbreak Attacks Against Large Language Models.","authors":"Yingchaojie Feng, Zhizhang Chen, Zhining Kang, Sijia Wang, Haoyu Tian, Wei Zhang, Minfeng Zhu, Wei Chen","doi":"10.1109/TVCG.2025.3575694","DOIUrl":"https://doi.org/10.1109/TVCG.2025.3575694","url":null,"abstract":"<p><p>The proliferation of large language models (LLMs) has underscored concerns regarding their security vulnerabilities, notably against jailbreak attacks, where adversaries design jailbreak prompts to circumvent safety mechanisms for potential misuse. Addressing these concerns necessitates a comprehensive analysis of jailbreak prompts to evaluate LLMs' defensive capabilities and identify potential weaknesses. However, the complexity of evaluating jailbreak performance and understanding prompt characteristics makes this analysis laborious. We collaborate with domain experts to characterize problems and propose an LLM-assisted framework to streamline the analysis process. It provides automatic jailbreak assessment to facilitate performance evaluation and support analysis of components and keywords in prompts. Based on the framework, we design JailbreakLens, a visual analysis system that enables users to explore the jailbreak performance against the target model, conduct multi-level analysis of prompt characteristics, and refine prompt instances to verify findings. Through a case study, technical evaluations, and expert interviews, we demonstrate our system's effectiveness in helping users evaluate model security and identify model weaknesses.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144210555","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"EGAvatar: Efficient GAN Inversion for Generalizable Head Avatar from Few-shot Images.","authors":"Hao-Pan Ren, Wei Duan, Wan-Yu Li, Yi Liu, Yu-Dong Guo, Shi-Sheng Huang, Ju-Yong Zhang, Hua Huang","doi":"10.1109/TVCG.2025.3575782","DOIUrl":"https://doi.org/10.1109/TVCG.2025.3575782","url":null,"abstract":"<p><p>Controllable head avatar reconstruction via the inversion of few-shot images using 3D generative models has demonstrated significant potential for efficient avatar creation. However, under limited input conditions, existing one-shot inversion methods often fail to produce high-fidelity results, frequently leading to shape distortions, expression deviations, and identity inconsistencies. To address these limitations, we propose EGAvatar, a novel and efficient 3DGAN inversion framework designed to generate high-fidelity, generalizable head avatars from few-shot images. The core principle of EGAvatar is a decoupling-by-inverting strategy, built upon an animatable 3DGAN prior. Specifically, we introduce an effective animatable 3DGAN model that synthesizes high-quality 3D avatars by integrating a coarse 3D triplane representation (derived from a latent 3DGAN) with an offset 3D triplane (learned via a triplane 3DGAN). Leveraging this architecture, we design a 3DGAN-based inversion approach to reconstruct 3D avatars efficiently. Additionally, we incorporate an expression-view disentanglement mechanism to maintain consistent appearance across varying expressions and viewpoints, thereby enhancing the generalizability of avatar reconstruction from limited input images. Extensive experiments conducted on two publicly available benchmarks and a private dataset demonstrate that EGAvatar outperforms existing state-of-the-art methods in both qualitative and quantitative evaluations. Notably, EGAvatar achieves superior performance while requiring significantly fewer input images and offering more efficient training and inference.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144210554","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ASight: Fine-tuning Auto-Scheduling Optimizations for Model Deployment via Visual Analytics.","authors":"Laixin Xie, Chenyang Zhang, Ruofei Ma, Xingxing Xing, Wei Wan, Quan Li","doi":"10.1109/TVCG.2025.3574194","DOIUrl":"https://doi.org/10.1109/TVCG.2025.3574194","url":null,"abstract":"<p><p>Upon completing the design and training phases, deploying a deep learning model to specific hardware becomes necessary prior to its implementation in practical applications. To enhance the performance of the model, the developers must optimize it to decrease inference latency. Auto-scheduling, an automated approach that generates optimization schemes, offers a feasible option for large-scale auto-deployment. Nevertheless, the low-level code generated by auto-scheduling closely resembles hardware coding and may present challenges for human comprehension, thereby hindering future manual optimization efforts. In this study, we introduce ASight, a visual analytics system to assist engineers in identifying performance bottlenecks, comprehending the auto-generated low-level code, and obtaining insights from auto-scheduling optimizations. We develop a subgraph matching algorithm capable of identifying graph isomorphism among Intermediate Representations to track performance bottlenecks from low-level metrics to high-level computational graphs. To address the substantial profiling metrics involved in auto-scheduling and derive optimization design principles by summarizing commonalities among auto-scheduling optimizations, we propose an enhanced visualization for the large search space of auto-scheduling. We validate the effectiveness of ASight through two case studies, one focused on a local machine and the other on a data center, along with a quantitative experiment exploring optimization design principles.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144180999","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"StructLayoutFormer: Conditional Structured Layout Generation via Structure Serialization and Disentanglement.","authors":"Xin Hu, Pengfei Xu, Jin Zhou, Hongbo Fu, Hui Huang","doi":"10.1109/TVCG.2025.3574311","DOIUrl":"https://doi.org/10.1109/TVCG.2025.3574311","url":null,"abstract":"<p><p>Structured layouts are preferable in many 2D visual contents (e.g., GUIs, webpages) since the structural information allows convenient layout editing. Computational frameworks can help create structured layouts but require heavy labor input. Existing data-driven approaches are effective in automatically generating fixed layouts but fail to produce layout structures. We present StructLayoutFormer, a novel Transformer-based approach for conditional structured layout generation. We use a structure serialization scheme to represent structured layouts as sequences. To better control the structures of generated layouts, we disentangle the structural information from the element placements. Our approach is the first data-driven approach that achieves conditional structured layout generation and produces realistic layout structures explicitly. We compare our approach with existing data-driven layout generation approaches by including post-processing for structure extraction. Extensive experiments have shown that our approach exceeds these baselines in conditional structured layout generation. We also demonstrate that our approach is effective in extracting and transferring layout structures. We will release the code upon the acceptance of this paper.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144164353","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Neural 3D Face Shape Stylization Based on Single Style Template via Weakly Supervised Learning.","authors":"Peizhi Yan, Rabab K Ward, Qiang Tang, Shan Du","doi":"10.1109/TVCG.2025.3573690","DOIUrl":"https://doi.org/10.1109/TVCG.2025.3573690","url":null,"abstract":"<p><p>3D Face shape stylization refers to transforming a realistic 3D face shape into a different style, such as a cartoon face style. To solve this problem, this paper proposes modeling this task as a deformation transfer problem. This approach significantly reduces labor costs, as the artists would only need to create a single template for each face style. Realistic facial features of the original 3D face e.g. the nose or chin shape, would thus be automatically transferred to those in the style template. Deformation transfer methods, however, have two drawbacks. They are slow and they require re-optimization for every new input face. To address these weaknesses, we propose a neural network-based 3D face shape stylization method. This method is trained through weakly supervised learning, and its template's structure is preserved using our novel templateguided mesh smoothing regularization. Our method is the first learning-based deformation transfer method for 3D face shape stylization. Its employment offers the useful and practical benefit of not requiring paired training data. The experiments show that the quality of the stylized faces obtained by our method is comparable to that of the traditional deformation transfer method, achieving an average Chamfer Distance of approximately 0.01mm. However, our approach significantly boosts the processing speed, achieving a rate approximately 3,000 times faster than the traditional deformation transfer. Project page: https://peizhiyan.github.io/docs/style.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144153015","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SGLDBench: A Benchmark Suite for Stress-Guided Lightweight 3D Designs.","authors":"Junpeng Wang, Dennis R Bukenberger, Simon Niedermayr, Christoph Neuhauser, Jun Wu, Rudiger Westermann","doi":"10.1109/TVCG.2025.3573774","DOIUrl":"https://doi.org/10.1109/TVCG.2025.3573774","url":null,"abstract":"<p><p>We introduce the Stress-Guided Lightweight Design Benchmark (SGLDBench), a comprehensive benchmark suite for applying and evaluating material layout strategies to generate stiff, lightweight designs in 3D domains. SGLDBench provides a seamlessly integrated simulation and analysis framework, including six reference strategies and a scalable multigrid elasticity solver to efficiently execute these strategies and validate the stiffness of their results. This facilitates the systematic analysis and comparison of design strategies based on the mechanical properties they achieve. SGLDBench enables the evaluation of diverse load conditions and, through the tight integration of the solver, supports high-resolution designs and stiffness analysis. Additionally, SGLDBench emphasizes visual analysis to explore the relationship between the geometric structure of a design and the distribution of stresses, offering insights into the specific properties and behaviors of different design strategies. SGLDBench's specific features are highlighted through several experiments, comparing the results of reference strategies with respect to geometric and mechanical properties.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144153016","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GroupTrackVis: A Visual Analytics Approach for Online Group Discussion-Based Teaching.","authors":"Xiaoyan Kui, Min Zhang, Mingkun Zhang, Ningkai Huang, Yuqi Guo, Jingwei Liu, Chao Zhang, Jiazhi Xia","doi":"10.1109/TVCG.2025.3573653","DOIUrl":"https://doi.org/10.1109/TVCG.2025.3573653","url":null,"abstract":"<p><p>Online group discussions play an important role in education reform by facilitating collaborative learning and knowledge sharing among participants. However, instructors face significant challenges in monitoring discussion progress, tracking student performance and understanding interaction dynamics due to overlapping conversations, time-varying participant behaviors, and hidden interaction patterns. To address these challenges, we propose GroupTrackVis, an interactive visual analytics system that incorporates both advanced algorithms and novel visualization designs, to help instructors analyze group discussions mainly from three perspectives: topic evolution, student performance, and interaction. GroupTrackVis proposes an enhanced topic segmentation algorithm by incorporating word vector weighting and reply relationship analysis, effectively disentangling overlapping discussions. It also extracts six key behavioral attributes from multimodal educational data, offering a comprehensive view of student performance and providing insights into the key factors driving learning outcomes. Additionally, a multi-layer tree network with edge bundling techniques is implemented to clearly visualize the dynamic evolution of student interactions. The integration of algorithms with interactive visualizations enables instructors to explore discussions quickly and dynamically adjust their analysis as the discussion evolves. The effectiveness of GroupTrackVis is demonstrated through two case studies, a user study, and expert interviews, highlighting its ability to support instructors in identifying engaged and disengaged students, and tracking discussion dynamics.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144153013","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}