{"title":"ViTon-GUN: Person-to-Person Virtual Try-on via Garment Unwrapping.","authors":"Nannan Zhang, Zhenyu Xie, Zhengwentai Sun, Hairui Zhu, Zirong Jin, Nan Xiang, Xiaoguang Han, Song Wu","doi":"10.1109/TVCG.2025.3550776","DOIUrl":"10.1109/TVCG.2025.3550776","url":null,"abstract":"<p><p>The image-based Person-to-Person (P2P) virtual try-on, involving the direct transfer of garments from one person to another, is one of the most promising applications of human-centric image generation. However, existing approaches struggle to accurately learn the clothing deformation when directly warping the garment from the source pose onto the target pose. To address this, we propose Person-to-Person virtual try-on via Garment UNwrapping, a novel framework dubbed ViTon-GUN. Specifically, we divide the P2P task into two subtasks: Person-to-Garment (P2G) and Garment-to-Person (G2P). P2G aims to unwrap the target garment from a source pose to a canonical representation based on the A-Pose. In the P2G stage, we enable the implementation of a flow-based P2G scheme by introducing an A-Pose estimator and establishing comprehensive training conditions. Building upon this step-wise strategy, we introduce a novel pipeline for P2P try-on. Once trained, the P2G strategy can serve as a \"plug-and-play\" module, which efficiently adapts existing diffusion-based pre-trained G2P models to P2P try-on without further training. Quantitative and qualitative experiments demonstrate that our ViTon-GUN performs remarkably well on P2P try-on, even for dresses with intricate design details.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-03-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143735690","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"As-Rigid-As-Possible Deformation of Gaussian Radiance Fields.","authors":"Xinhao Tong, Tianjia Shao, Yanlin Weng, Yin Yang, Kun Zhou","doi":"10.1109/TVCG.2025.3555404","DOIUrl":"https://doi.org/10.1109/TVCG.2025.3555404","url":null,"abstract":"<p><p>3D Gaussian Splatting (3DGS) models radiance fields as sparsely distributed 3D Gaussians, providing a compelling solution to novel view synthesis at high resolutions and real-time frame rates. However, deforming objects represented by 3D Gaussians remains a challenging task. Existing methods deform a 3DGS object by editing the Gaussians geometrically. These approaches ignore the fact that it is the radiance field that is rasterized and rendered into the final image. The inconsistency between the deformed 3D Gaussians and the desired radiance field inevitably leads to artifacts in the final results. In this paper, we propose an interactive method for as-rigid-as-possible (ARAP) deformation of Gaussian radiance fields. Specifically, after performing geometric edits on the Gaussians, we further optimize the Gaussians to ensure that their rasterization yields a result consistent with the deformed radiance field. To facilitate this objective, we design radial features, densely sampled across the radiance field, that mathematically describe the radial difference before and after the deformation. Additionally, we propose an adaptive anisotropic spatial low-pass filter to prevent aliasing during sampling and to preserve the field under varying, non-uniform sampling intervals. Users can interactively employ this tool to achieve large-scale ARAP deformations of the radiance field. Since our method maintains the consistency of the Gaussian radiance field before and after deformation, it avoids artifacts that are common in existing 3DGS deformation frameworks. Meanwhile, our method retains the high rendering quality and efficiency of 3DGS.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143733738","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
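The record above builds on the classic as-rigid-as-possible (ARAP) formulation. As background only — this is a minimal sketch of the standard ARAP local step (best-fit rotation of a cell's rest-pose edge vectors onto their deformed counterparts via SVD, i.e. the Kabsch method) and the resulting rigidity energy, not the paper's Gaussian-field optimization; all function names are illustrative:

```python
import numpy as np

def best_fit_rotation(P, Q):
    """ARAP local step: find the rotation R minimizing sum_i ||R p_i - q_i||^2,
    where rows of P are rest-pose edge vectors and rows of Q are deformed ones."""
    H = P.T @ Q                      # 3x3 cross-covariance of the edge sets
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:         # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R

def arap_energy(P, Q):
    """Rigidity energy of one cell after fitting the optimal rotation."""
    R = best_fit_rotation(P, Q)
    return np.sum((P @ R.T - Q) ** 2)
```

A pure rotation of the rest-pose edges yields (numerically) zero energy, while any non-rigid deformation — e.g. uniform scaling — leaves a positive residual; a full ARAP solver alternates this local step with a global linear solve for vertex positions.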
{"title":"Multimodal Neural Acoustic Fields for Immersive Mixed Reality.","authors":"Guansen Tong, Johnathan Chi-Ho Leung, Xi Peng, Haosheng Shi, Liujie Zheng, Shengze Wang, Arryn Carlos O'Brien, Ashley Paula-Ann Neall, Grace Fei, Martim Gaspar, Praneeth Chakravarthula","doi":"10.1109/TVCG.2025.3549898","DOIUrl":"10.1109/TVCG.2025.3549898","url":null,"abstract":"<p><p>We introduce multimodal neural acoustic fields for synthesizing spatial sound and enabling the creation of immersive auditory experiences from novel viewpoints and in completely unseen new environments, both virtual and real. Extending the concept of neural radiance fields to acoustics, we develop a neural network-based model that maps an environment's geometric and visual features to its audio characteristics. Specifically, we introduce a novel hybrid transformer-convolutional neural network to accomplish two core tasks: capturing the reverberation characteristics of a scene from audio-visual data, and generating spatial sound in an unseen new environment from signals recorded at sparse positions and orientations within the original scene. By learning to represent spatial acoustics in a given environment, our approach enables creation of realistic immersive auditory experiences, thereby enhancing the sense of presence in augmented and virtual reality applications. We validate the proposed approach on both synthetic and real-world visual-acoustic data and demonstrate that our method produces nonlinear acoustic effects such as reverberations, and improves spatial audio quality compared to existing methods. Furthermore, we also conduct subjective user studies and demonstrate that the proposed framework significantly improves audio perception in immersive mixed reality applications.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143733761","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Survey of Deep Learning in Sports Applications: Perception, Comprehension, and Decision.","authors":"Zhonghan Zhao, Wenhao Chai, Shengyu Hao, Wenhao Hu, Guanhong Wang, Shidong Cao, Mingli Song, Jenq-Neng Hwang, Gaoang Wang","doi":"10.1109/TVCG.2025.3554801","DOIUrl":"10.1109/TVCG.2025.3554801","url":null,"abstract":"<p><p>Deep learning has the potential to revolutionize sports performance, with applications ranging from perception and comprehension to decision. This paper presents a comprehensive survey of deep learning in sports performance, focusing on three main aspects: algorithms, datasets and virtual environments, and challenges. Firstly, we discuss the hierarchical structure of deep learning algorithms in sports performance which includes perception, comprehension and decision while comparing their strengths and weaknesses. Secondly, we list widely used existing datasets in sports and highlight their characteristics and limitations. Finally, we summarize current challenges and point out future trends of deep learning in sports. Our survey provides valuable reference material for researchers interested in deep learning in sports applications.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143733736","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Audio-visual aware Foveated Rendering.","authors":"Xuehuai Shi, Yucheng Li, Jiaheng Li, Jian Wu, Jieming Yin, Xiaobai Chen, Lili Wang","doi":"10.1109/TVCG.2025.3554737","DOIUrl":"10.1109/TVCG.2025.3554737","url":null,"abstract":"<p><p>With the increasing complexity of geometry and rendering effects in virtual reality (VR) scenes, existing foveated rendering methods for VR head-mounted displays (HMDs) struggle to meet users' demands for VR scene rendering with high frame rates (≥ 60 fps for rendering binocular foveated images in VR scenes containing over 50M triangles). Current research validates that auditory content affects the perception of the human visual system (HVS). However, existing foveated rendering methods primarily model the HVS's eccentricity-dependent visual perception ability on the visual content in VR while ignoring the impact of auditory content on the HVS's visual perception. In this paper, we introduce an auditory-content-based perceived rendering quality analysis to quantify the impact of visual perception under different auditory conditions in foveated rendering. Based on the analysis results, we propose an audio-visual aware foveated rendering method (AvFR). AvFR first constructs an audio-visual feature-driven perception model that predicts the eccentricity-based visual perception in real time by combining the scene's audio-visual content, and then proposes a foveated rendering cost optimization algorithm to adaptively control the shading rate of different regions with the guidance of the perception model. In complex scenes with visual and auditory content containing over 1.17M triangles, AvFR renders high-quality binocular foveated images at an average frame rate of 116 fps. The results of the main user study and performance evaluation validate that AvFR achieves significant performance improvement (up to 1.4× speedup) without lowering the perceived visual quality compared with the state-of-the-art VR-HMD foveated rendering method.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143733740","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
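Foveated rendering, as described in the record above, shades the fovea at full rate and the periphery progressively coarser. Purely as an illustrative toy — this is NOT AvFR's perception model, and the `audio_gain` parameter is a hypothetical stand-in for an audio-driven modulation term — an eccentricity-to-shading-rate mapping might look like:

```python
import math

def shading_rate(ecc_deg, audio_gain=1.0):
    """Toy eccentricity-to-shading-rate mapping (illustrative only).
    Returns the edge length of the coarse-shading block: 1 means shade
    every pixel; 4 means one shading sample per 4x4 block, etc.
    audio_gain (hypothetical) sharpens or relaxes the peripheral falloff."""
    if ecc_deg <= 5.0:               # foveal region: full shading rate
        return 1
    # coarsen by powers of two as eccentricity grows beyond the fovea
    level = int(math.log2(1.0 + (ecc_deg - 5.0) * audio_gain / 10.0)) + 1
    return min(2 ** level, 8)        # cap at 8x8 coarse shading
```

The rate is monotonically non-decreasing in eccentricity, and lowering `audio_gain` (e.g. when audio draws attention away from a region is *not* modeled) keeps more of the periphery at finer rates.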
{"title":"Human Performance and Perception of Uncertainty Visualizations in Geospatial Applications: A Scoping Review.","authors":"Ryan Tennant, Tania Randall","doi":"10.1109/TVCG.2025.3554969","DOIUrl":"10.1109/TVCG.2025.3554969","url":null,"abstract":"<p><p>Geospatial data are often uncertain due to measurement, spatial, or temporal limitations. A knowledge gap exists about how geospatial uncertainty visualization techniques influence human factors measures. This comprehensive review synthesized the current literature on visual representations of uncertainty in geospatial data applications, identifying the breadth of techniques and the relationships between strategies and human performance and perception outcomes. Eligible articles described and evaluated at least one method for representing uncertainty in geographical data with participants, including land, ocean, weather, climate, and positioning data. Forty articles were included. Uncertainty was visualized using multivariate and univariate maps through colours, shapes, boundary regions, textures, symbols, grid noise, and text. There were varying effects, and no definitive superior method was identified. The predominant user focus was on novices. Trends were observed in supporting users' understanding of uncertainty, user preferences, confidence, decision-making performance, and response times for different techniques and application contexts. The findings highlight the impacts of different categorizations within colour and shape techniques, heterogeneity in perception and performance evaluation, performance and perception mismatch, and differences and similarities between novices and experts. Contextual factors and user characteristics, including understanding the decision-maker's tasks, user type, and desired outcomes for decision-support, appear to be important factors influencing the design of effective uncertainty visualizations. Future research on geospatial applications of uncertainty visualizations can expand on the observed trends with consistent and standardized measurement and reporting, further explore human performance and perception impacts with 3-dimensional and interactive uncertainty visualizations, and perform real-world evaluations within various contexts.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143733758","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"HO-NeRF: Radiance Fields Reconstruction for Two-Hand-Held Objects.","authors":"Xinxin Liu, Qi Zhang, Xin Huang, Ying Feng, Guoqing Zhou, Qing Wang","doi":"10.1109/TVCG.2025.3553975","DOIUrl":"https://doi.org/10.1109/TVCG.2025.3553975","url":null,"abstract":"<p><p>Our work aims to reconstruct the appearance and geometry of a two-hand-held object from a sequence of color images. In contrast to traditional single-hand-held manipulation, two-hand-holding allows more flexible interaction, thereby providing back views of the object, which is particularly convenient for reconstruction but generates complex view-dependent occlusions. The recent development of neural rendering provides new potential for hand-held object reconstruction. In this paper, we propose a novel neural representation-based framework to recover radiance fields of the two-hand-held object, named HO-NeRF. We first design an object-centric semantic module based on geometric signed distance function cues to predict 3D object-centric regions, and develop a view-dependent visible module based on image-related cues to label 2D occluded regions. We then combine them to obtain a 2D visible mask that adaptively guides ray sampling on the object for optimization. We also provide a newly collected HO dataset to validate the proposed method. Experiments show that our method achieves superior performance on reconstruction completeness and view-consistent synthesis compared to the state-of-the-art methods.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143712668","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving Neural Volume Rendering via Learning View-Dependent Integral Approximation.","authors":"Yifan Wang, Jun Xu, Yuan Zeng, Yi Gong","doi":"10.1109/TVCG.2025.3554692","DOIUrl":"10.1109/TVCG.2025.3554692","url":null,"abstract":"<p><p>Neural radiance fields (NeRFs) have achieved impressive view synthesis results by learning an implicit volumetric representation from multi-view images. To project the implicit representation into an image, NeRF employs volume rendering that approximates the continuous integrals of rays as an accumulation of the colors and densities of the sampled points. Although this approximation enables efficient rendering, it ignores the direction information in point intervals, resulting in ambiguous features and limited reconstruction quality. In this paper, we propose a learning method that utilizes learnable view-dependent features to improve scene representation and reconstruction. We model the volume rendering integral with a piecewise constant volume density and spherical harmonic-guided view-dependent features, facilitating ambiguity elimination while preserving the rendering efficiency. In addition, we introduce a regularization term that restricts the anisotropic representation effect to be local, with negligible effect on geometry representations, and that encourages recovering the correct geometry. Our method is flexible and can be plugged into NeRF-based frameworks. Extensive experiments show that the proposed representation can boost the rendering quality of various NeRFs and achieve state-of-the-art rendering performance on both synthetic and real-world scenes.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143702485","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
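The approximation this record critiques is NeRF's standard quadrature of the volume rendering integral: per-sample opacity α_i = 1 − exp(−σ_i δ_i), transmittance T_i = Π_{j&lt;i}(1 − α_j), and ray color C = Σ_i T_i α_i c_i. A minimal NumPy sketch of that baseline accumulation (background only, not the paper's view-dependent variant):

```python
import numpy as np

def composite_ray(colors, densities, deltas):
    """Standard NeRF-style discrete volume rendering along one ray.
    colors: (N, 3) per-sample RGB; densities: (N,) sigma values;
    deltas: (N,) lengths of the sample intervals."""
    alphas = 1.0 - np.exp(-densities * deltas)     # opacity of each interval
    # transmittance T_i: probability the ray reaches sample i unoccluded
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))
    weights = trans * alphas                       # compositing weights
    return (weights[:, None] * colors).sum(axis=0)
```

With a single near-opaque red sample the ray color approaches pure red, and an opaque sample in front of another fully occludes it — exactly the accumulation behavior the abstract refers to.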
{"title":"A Mixed Reality Car A-Pillar Design Support System Utilizing Projection Mapping.","authors":"Ryotaro Yoshida, Toshihiro Hara, Yusaku Takeda, Kenji Murase, Daisuke Iwai, Kosuke Sato","doi":"10.1109/TVCG.2025.3554037","DOIUrl":"https://doi.org/10.1109/TVCG.2025.3554037","url":null,"abstract":"<p><p>Projection mapping (PM) is useful in the product design process, since it seamlessly bridges a physical mockup and its digital twin by allowing designers to interactively explore new textures, colors, and shapes without the need to create new physical mockups. While PM has proven effective for car interior design, previous research focused solely on supporting the design of dashboards and instrument panels, neglecting evaluation in realistic driving scenarios. This paper introduces a self-contained car interior design support system that extends beyond the dashboard to include the A-pillars. Additionally, to enable designers to evaluate their designs in authentic driving conditions, we integrate a driving simulator, complete with a motion platform, into the PM system. Through the construction of a prototype, we demonstrate the feasibility of our proposed system. Finally, through user studies, we derive guidelines for PM-based car interior design to optimize the user experience.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143702484","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MineVRA: Exploring the Role of Generative AI-Driven Content Development in XR Environments through a Context-Aware Approach.","authors":"Emiliano Santarnecchi, Emanuele Balloni, Marina Paolanti, Emanuele Frontoni, Lorenzo Stacchio, Primo Zingaretti, Roberto Pierdicca","doi":"10.1109/TVCG.2025.3549160","DOIUrl":"10.1109/TVCG.2025.3549160","url":null,"abstract":"<p><p>The convergence of Artificial Intelligence (AI), Computer Vision (CV), Computer Graphics (CG), and Extended Reality (XR) is driving innovation in immersive environments. A key challenge in these environments is the creation of personalized 3D assets, traditionally achieved through manual modeling, a time-consuming process that often fails to meet individual user needs. More recently, Generative AI (GenAI) has emerged as a promising solution for automated, context-aware content generation. In this paper, we present MineVRA (MultImodal generative artificial iNtelligence for contExt-aware Virtual Reality Assets), a novel Human-In-The-Loop (HITL) XR framework that integrates GenAI to facilitate coherent and adaptive 3D content generation in immersive scenarios. To evaluate the effectiveness of this approach, we conducted a comparative user study analyzing the performance and user satisfaction of GenAI-generated 3D objects compared to those generated by Sketchfab in different immersive contexts. The results suggest that GenAI can significantly complement traditional 3D asset libraries, with valuable design implications for the development of human-centered XR environments.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143675020","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}