{"title":"Dual-COPE: A novel prior-based category-level object pose estimation network with dual Sim2Real unsupervised domain adaptation module","authors":"Xi Ren , Nan Guo , Zichen Zhu , Xinbei Jiang","doi":"10.1016/j.cag.2024.104045","DOIUrl":"10.1016/j.cag.2024.104045","url":null,"abstract":"<div><p>Category-level pose estimation offers the generalization ability to novel objects unseen during training, which has attracted increasing attention in recent years. Despite the advantage, annotating real-world data with pose label is intricate and laborious. Although using synthetic data with free annotations can greatly reduce training costs, the Synthetic-to-Real (Sim2Real) domain gap could result in a sharp performance decline on real-world test. In this paper, we propose Dual-COPE, a novel prior-based category-level object pose estimation method with dual Sim2Real domain adaptation to avoid expensive real pose annotations. First, we propose an estimation network featured with conjoined prior deformation and transformer-based matching to realize high-precision pose prediction. Upon that, an efficient dual Sim2Real domain adaptation module is further designed to reduce the feature distribution discrepancy between synthetic and real-world data both semantically and geometrically, thus maintaining superior performance on real-world test. Moreover, the adaptation module is loosely coupled with estimation network, allowing for easy integration with other methods without any additional inference overhead. 
Comprehensive experiments show that Dual-COPE outperforms existing unsupervised methods and achieves state-of-the-art precision under supervised settings.</p></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"124 ","pages":"Article 104045"},"PeriodicalIF":2.5,"publicationDate":"2024-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142048643","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mesh-controllable multi-level-of-detail text-to-3D generation","authors":"Dongjin Huang , Nan Wang , Xinghan Huang , Jiantao Qu , Shiyu Zhang","doi":"10.1016/j.cag.2024.104039","DOIUrl":"10.1016/j.cag.2024.104039","url":null,"abstract":"<div><p>Text-to-3D generation is a challenging but significant task and has gained widespread attention. Its capability to rapidly generate 3D digital assets holds huge potential application value in fields such as film, video games, and virtual reality. However, current methods often face several drawbacks, including long generation times, difficulties with the multi-face Janus problem, and issues like chaotic topology and redundant structures during mesh extraction. Additionally, the lack of control over the generated results limits their utility in downstream applications. To address these problems, we propose a novel text-to-3D framework capable of generating meshes with high fidelity and controllability. Our approach can efficiently produce meshes and textures that match the text description and the desired level of detail (LOD) by specifying input text and LOD preferences. This framework consists of two stages. In the coarse stage, 3D Gaussians are employed to accelerate generation speed, and weighted positive and negative prompts from various observation perspectives are used to address the multi-face Janus problem in the generated results. In the refinement stage, mesh vertices and faces are iteratively refined to enhance surface quality and output meshes and textures that meet specified LOD requirements. Compared to the state-of-the-art text-to-3D methods, extensive experiments demonstrate that the proposed method performs better in solving the multi-face Janus problem, enabling the rapid generation of 3D meshes with enhanced prompt adherence. 
Furthermore, the proposed framework can generate meshes with enhanced topology, offering controllable vertices and faces with textures featuring UV adaptation to achieve multi-level-of-detail (LOD) outputs. Specifically, the proposed method can preserve the output’s relevance to input texts during simplification, making it better suited for mesh editing and efficient rendering. User studies also indicate that our framework receives higher evaluations compared to other methods.</p></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"123 ","pages":"Article 104039"},"PeriodicalIF":2.5,"publicationDate":"2024-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141998269","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A review of motion retargeting techniques for 3D character facial animation","authors":"ChangAn Zhu, Chris Joslin","doi":"10.1016/j.cag.2024.104037","DOIUrl":"10.1016/j.cag.2024.104037","url":null,"abstract":"<div><p>3D face animation has been a critical component of character animation in a wide range of media since the early 90’s. The conventional process for animating a 3D face is usually keyframe-based, which is labor-intensive. Therefore, the film and game industries have started using live-action actors’ performances to animate the faces of 3D characters, the process is also known as performance-driven facial animation. At the core of performance-driven facial animation is facial motion retargeting, which transfers the source facial motions to a target 3D face. However, facial motion retargeting still has many limitations that influence its capability to further assist the facial animation process. Existing motion retargeting frameworks cannot accurately transfer the source motion’s semantic information (i.e., meaning and intensity of the motion), especially when applying the motion to non-human-like or stylized target characters. The retargeting quality relies on the parameterization of the target face, which is time-consuming to build and usually not generalizable across proportionally different faces. In this survey paper, we review the literature relating to 3D facial motion retargeting methods and the relevant topics within this area. We provide a systematic understanding of the essential modules of the retargeting pipeline, a taxonomy of the available approaches under these modules, and a thorough analysis of their advantages and limitations with research directions that could potentially contribute to this area. 
We also contributed a 3D character categorization matrix, which has been used in this survey and might be useful for future research to evaluate the character compatibility of their retargeting or face parameterization methods.</p></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"123 ","pages":"Article 104037"},"PeriodicalIF":2.5,"publicationDate":"2024-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0097849324001729/pdfft?md5=887467d22bf59df3534253c1761b0e20&pid=1-s2.0-S0097849324001729-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141990816","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Foreword to the Special Section on Smart Tools and Applications in Graphics (STAG 2023)","authors":"Nicola Capece , Katia Lupinetti , Ugo Erra , Francesco Banterle","doi":"10.1016/j.cag.2024.104036","DOIUrl":"10.1016/j.cag.2024.104036","url":null,"abstract":"<div><p>The Special Section contains extended and revised versions of the best papers presented at the 10th Conference on Smart Tools and Applications in Graphics (STAG 2023), held in Matera on November 16–17, 2023. Four papers were selected by appointed members from the Program Committee; extended versions were submitted and further reviewed by external experts. The result is a rich collection of papers spanning diverse domains: from shape analysis and computational geometry to advanced applications in machine learning, virtual interaction, and digital fabrication. Topics include shape modeling, functional maps, and point clouds, highlighting cutting-edge research in user experience and interaction design.</p></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"123 ","pages":"Article 104036"},"PeriodicalIF":2.5,"publicationDate":"2024-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142007100","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing anatomy learning through collaborative VR? An advanced investigation","authors":"Haya Almaree , Roland Fischer , René Weller , Verena Uslar , Dirk Weyhe , Gabriel Zachmann","doi":"10.1016/j.cag.2024.104019","DOIUrl":"10.1016/j.cag.2024.104019","url":null,"abstract":"<div><p>Common techniques for anatomy education in medicine include lectures and cadaver dissection, as well as the use of replicas. However, recent advances in virtual reality (VR) technology have led to the development of specialized VR tools for teaching, training, and other purposes. The use of VR technology has the potential to greatly enhance the learning experience for students. These tools offer highly interactive and engaging learning environments that allow students to inspect and interact with virtual 3D anatomical structures repeatedly, intuitively, and immersively. Additionally, multi-user VR environments can facilitate collaborative learning, which has the potential to enhance the learning experience even further. However, the effectiveness of collaborative learning in VR has not been adequately explored. Therefore, we conducted two user studies, each with <span><math><mrow><msub><mrow><mi>n</mi></mrow><mrow><mn>1</mn><mo>,</mo><mn>2</mn></mrow></msub><mo>=</mo><mn>33</mn></mrow></math></span> participants, to evaluate the effectiveness of virtual collaboration in the context of anatomy learning, and compared it to individual learning. For our two studies, we developed a multi-user VR anatomy learning application using UE4. Our results demonstrate that our VR Anatomy Atlas offers an engaging and effective learning experience for anatomy, both individually and collaboratively. However, we did not find any significant advantages of collaborative learning in terms of learning effectiveness or motivation, despite the multi-user group spending more time in the learning environment. In fact, motivation tended to be slightly lower. 
Although the usability was rather high for the single-user condition, it tended to be lower for the multi-user group in one of the two studies, which may have had a slightly negative effect. However, in the second study, the usability scores were similarly high for both groups. The absence of advantages for collaborative learning may be due to the more complex environment and higher cognitive load. In consequence, more research into collaborative VR learning is needed to determine the relevant factors promoting collaborative learning in VR and the settings in which individual or collaborative learning in VR is more effective, respectively.</p></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"123 ","pages":"Article 104019"},"PeriodicalIF":2.5,"publicationDate":"2024-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142040307","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Retinal pre-filtering for light field displays","authors":"Rafael Romeiro , Elmar Eisemann , Ricardo Marroquim","doi":"10.1016/j.cag.2024.104033","DOIUrl":"10.1016/j.cag.2024.104033","url":null,"abstract":"<div><p>The display coefficients that produce the signal emitted by a light field display are usually calculated to approximate the radiance over a set of sampled rays in the light field space. However, not all information contained in the light field signal is of equal importance to an observer. We propose a retinal pre-filtering of the light field samples that takes into account the image formation process of the observer to determine display coefficients that will ultimately produce better retinal images for a range of focus distances. We demonstrate a significant increase in image definition without changing the display resolution.</p></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"123 ","pages":"Article 104033"},"PeriodicalIF":2.5,"publicationDate":"2024-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0097849324001687/pdfft?md5=a3ada2f14da0a4ee885b3020bef4c154&pid=1-s2.0-S0097849324001687-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142040308","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Surveying the evolution of virtual humans expressiveness toward real humans","authors":"Paulo Knob, Greice Pinho, Gabriel Fonseca Silva, Rubens Montanha, Vitor Peres, Victor Araujo, Soraia Raupp Musse","doi":"10.1016/j.cag.2024.104034","DOIUrl":"10.1016/j.cag.2024.104034","url":null,"abstract":"<div><p>Virtual Humans (VHs) emerged over 50 years ago and have since experienced notable advancements. Initially, developing and animating VHs posed significant challenges. However, modern technology, both commercially available and freely accessible, has democratized the creation and animation processes, making them more accessible to users, programmers, and designers. These advancements have led to the replication of authentic traits and behaviors of real actors in VHs, resulting in visually convincing and behaviorally lifelike characters. As a consequence, many research areas arise as functional VH technologies. This paper explored the evolution of four areas and emerging trends related to VHs while examining some of the implications and challenges posed by highly realistic characters within these domains.</p></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"123 ","pages":"Article 104034"},"PeriodicalIF":2.5,"publicationDate":"2024-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141942921","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SP-SeaNeRF: Underwater Neural Radiance Fields with strong scattering perception","authors":"Lifang Chen , Yuchen Xiong , Yanjie Zhang , Ruiyin Yu , Lian Fang , Defeng Liu","doi":"10.1016/j.cag.2024.104025","DOIUrl":"10.1016/j.cag.2024.104025","url":null,"abstract":"<div><p>Water and light interactions cause color shifts and blurring in underwater images, while dynamic underwater illumination further disrupts scene consistency, resulting in poor performance of optical image-based reconstruction methods underwater. Although Neural Radiance Fields (NeRF) can describe an aqueous medium through volume rendering, applying it directly underwater may induce artifacts and floaters. We propose SP-SeaNeRF, which uses a micro MLP to predict water column parameters and simulates the degradation process as a combination of real colors and scattered colors in underwater images, enhancing the model’s perception of scattering. We use illumination embedding vectors to learn the illumination bias within the images, in order to prevent dynamic illumination from disrupting scene consistency. We introduce a novel sampling module that focuses on maximum-weight points and effectively improves training and inference speed. We evaluated our proposed method on the SeaThru-NeRF and Neuralsea underwater datasets. 
The experimental results show that our method exhibits superior underwater color restoration ability, outperforming existing underwater NeRF in terms of reconstruction quality and speed.</p></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"123 ","pages":"Article 104025"},"PeriodicalIF":2.5,"publicationDate":"2024-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142007101","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Investigating the relationships between user behaviors and tracking factors on task performance and trust in augmented reality","authors":"Matt Gottsacker , Hiroshi Furuya , Zubin Datta Choudhary , Austin Erickson , Ryan Schubert , Gerd Bruder , Michael P. Browne , Gregory F. Welch","doi":"10.1016/j.cag.2024.104035","DOIUrl":"10.1016/j.cag.2024.104035","url":null,"abstract":"<div><p>This research paper explores the impact of augmented reality (AR) tracking characteristics, specifically an AR head-worn display’s tracking registration accuracy and precision, on users’ spatial abilities and subjective perceptions of trust in and reliance on the technology. Our study aims to clarify the relationships between user performance and the different behaviors users may employ based on varying degrees of trust in and reliance on AR. Our controlled experimental setup used a 360° field-of-regard search-and-selection task and combines the immersive aspects of a CAVE-like environment with AR overlays viewed with a head-worn display.</p><p>We investigated three levels of simulated AR tracking errors in terms of both accuracy and precision (+0°, +1°, +2°). We controlled for four user task behaviors that correspond to different levels of trust in and reliance on an AR system: <em>AR-Only</em> (only relying on AR), <em>AR-First</em> (prioritizing AR over real world), <em>Real-Only</em> (only relying on real world), and <em>Real-First</em> (prioritizing real world over AR). By controlling for these behaviors, our results showed that even small amounts of AR tracking errors had noticeable effects on users’ task performance, especially if they relied completely on the AR cues (AR-Only). 
Our results link AR tracking characteristics with user behavior, highlighting the importance of understanding these elements to improve AR technology and user satisfaction.</p></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"123 ","pages":"Article 104035"},"PeriodicalIF":2.5,"publicationDate":"2024-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141964253","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Visually communicating pathological changes: A case study on the effectiveness of phong versus outline shading","authors":"Sarah Mittenentzwei , Sophie Mlitzke , Darija Grisanova , Kai Lawonn , Bernhard Preim , Monique Meuschke","doi":"10.1016/j.cag.2024.104023","DOIUrl":"10.1016/j.cag.2024.104023","url":null,"abstract":"<div><p>In this paper, we investigate the suitability of different visual representations of pathological growth and shrinkage using surface models of intracranial aneurysms and liver tumors. By presenting complex medical information in a visually accessible manner, audiences can better understand and comprehend the progression of pathological structures. Previous work in medical visualization provides an extensive design space for visualizing medical image data. However, determining which visualization techniques are appropriate for a general audience has not been thoroughly investigated.</p><p>We conducted a user study (n = 40) to evaluate different visual representations in terms of their suitability for solving tasks and their aesthetics. We created surface models representing the evolution of pathological structures over multiple discrete time steps and visualized them using illumination-based and illustrative techniques. Our results indicate that users’ aesthetic preferences largely coincide with their preferred visualization technique for task-solving purposes. 
In general, the illumination-based technique has been preferred to the illustrative technique, but the latter offers great potential for increasing the accessibility of visualizations to users with color vision deficiencies.</p></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"123 ","pages":"Article 104023"},"PeriodicalIF":2.5,"publicationDate":"2024-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0097849324001584/pdfft?md5=290698cd5eeb6b5b6aca798a4452f2fb&pid=1-s2.0-S0097849324001584-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142002321","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}