{"title":"Self-supervised reconstruction of re-renderable facial textures from single image","authors":"Mingxin Yang , Jianwei Guo , Xiaopeng Zhang , Zhanglin Cheng","doi":"10.1016/j.cag.2024.104096","DOIUrl":"10.1016/j.cag.2024.104096","url":null,"abstract":"<div><div>Reconstructing high-fidelity 3D facial texture from a single image is a quite challenging task due to the lack of complete face information and the domain gap between the 3D face and 2D image. Further, obtaining re-renderable 3D faces has become a strongly desired property in many applications, where the term ’re-renderable’ demands the facial texture to be spatially complete and disentangled with environmental illumination. In this paper, we propose a new self-supervised deep learning framework for reconstructing high-quality and re-renderable facial albedos from single-view images in the wild. Our main idea is to first utilize a <em>prior generation module</em> based on the 3DMM proxy model to produce an unwrapped texture and a globally parameterized prior albedo. Then we apply a <em>detail refinement module</em> to synthesize the final texture with both high-frequency details and completeness. To further make facial textures disentangled with illumination, we propose a novel detailed illumination representation that is reconstructed with the detailed albedo together. We also design several novel regularization losses on both the albedo and illumination maps to facilitate the disentanglement of these two factors. Finally, by leveraging a differentiable renderer, each face attribute can be jointly trained in a self-supervised manner without requiring ground-truth facial reflectance. 
Extensive comparisons and ablation studies on challenging datasets demonstrate that our framework outperforms state-of-the-art approaches.</div></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"124 ","pages":"Article 104096"},"PeriodicalIF":2.5,"publicationDate":"2024-09-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142446516","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Psychophysiology of rhythmic stimuli and time experience in virtual reality","authors":"Stéven Picard, Jean Botev","doi":"10.1016/j.cag.2024.104097","DOIUrl":"10.1016/j.cag.2024.104097","url":null,"abstract":"<div><div>Time experience is an essential part of one’s perception of any environment, real or virtual. In this article, from a virtual environment design perspective, we explore how rhythmic stimuli can influence an unrelated cognitive task regarding time experience and performance in virtual reality. This study explicitly includes physiological data to investigate how, overall, experience correlates with psychophysiological observations. The task involves sorting 3D objects by shape, with varying rhythmic stimuli in terms of their tempo and sensory channel (auditory and/or visual) in different trials, to collect subjective measures of time estimation and judgment. The results indicate different effects on time experience and performance depending on the context, such as user fatigue and trial repetition. Depending on the context, a positive impact of audio stimuli or a negative impact of visual stimuli on task performance can be observed, as well as time being underestimated concerning tempo in relation to task familiarity. However, some effects are consistent regardless of context, such as time being judged to pass faster with additional stimuli or consistent correlations between participants’ performance and time experience, suggesting flow-related aspects. We also observe correlations between time experience with eye-tracking data and body temperature, yet some of these correlations may be due to a confounding effect of fatigue. If confirmed as separate from fatigue, these physiological data could be used as reference point for evaluating a user’s time experience. 
This might be of great interest for designing virtual environments, as purposeful stimuli can strongly influence task performance and time experience, both essential components of virtual environment user experience.</div></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"124 ","pages":"Article 104097"},"PeriodicalIF":2.5,"publicationDate":"2024-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142417389","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing MeshNet for 3D shape classification with focal and regularization losses","authors":"Meng Liu, Feiyu Zhao","doi":"10.1016/j.cag.2024.104094","DOIUrl":"10.1016/j.cag.2024.104094","url":null,"abstract":"<div><div>With the development of deep learning and computer vision, an increasing amount of research has focused on applying deep learning models to the recognition and classification of three-dimensional shapes. In classification tasks, differences in sample quantity, feature amount, model complexity, and other aspects among different categories of 3D model data cause significant variations in classification difficulty. However, simple cross-entropy loss is generally used as the loss function, but it is insufficient to address these differences. In this paper, we used MeshNet as the base model and introduced focal loss as a metric for the loss function. Additionally, to prevent deep learning models from developing a preference for specific categories, we incorporated regularization loss. The combined use of focal loss and regularization loss in optimizing the MeshNet model’s loss function resulted in a classification accuracy of up to 92.46%, representing a 0.20% improvement over the original model’s highest accuracy of 92.26%. Furthermore, the average accuracy over the final 50 epochs remained stable at a higher level of 92.01%, reflecting a 0.71% improvement compared to the original MeshNet model’s 91.30%. 
These results indicate that our method performs better on the 3D shape classification task.</div></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"124 ","pages":"Article 104094"},"PeriodicalIF":2.5,"publicationDate":"2024-09-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142356927","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ChatKG: Visualizing time-series patterns aided by intelligent agents and a knowledge graph","authors":"Leonardo Christino , Fernando V. Paulovich","doi":"10.1016/j.cag.2024.104092","DOIUrl":"10.1016/j.cag.2024.104092","url":null,"abstract":"<div><div>Line-chart visualizations of temporal data enable users to identify interesting patterns for the user to inquire about. Using Intelligent Agents (IA), Visual Analytic tools can automatically uncover <em>explicit knowledge</em> related information to said patterns. Yet, visualizing the association of data, patterns, and knowledge is not straightforward. In this paper, we present <em>ChatKG</em>, a novel visual analytics strategy that allows exploratory data analysis of a Knowledge Graph that associates temporal sequences, the patterns found in each sequence, the temporal overlap between patterns, the related knowledge of each given pattern gathered from a multi-agent IA, and the IA’s suggestions of related datasets for further analysis visualized as annotations. We exemplify and informally evaluate ChatKG by analyzing the world’s life expectancy. For this, we implement an oracle that automatically extracts relevant or interesting patterns, populates the Knowledge Graph to be visualized, and, during user interaction, inquires the multi-agent IA for related information and suggests related datasets to be displayed as visual annotations. 
Our tests and an interview conducted showed that ChatKG is well suited for temporal analysis of temporal patterns and their related knowledge when applied to history studies.</div></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"124 ","pages":"Article 104092"},"PeriodicalIF":2.5,"publicationDate":"2024-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142357034","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Executing realistic earthquake simulations in unreal engine with material calibration","authors":"Yitong Sun , Hanchun Wang , Zhejun Zhang , Cyriel Diels , Ali Asadipour","doi":"10.1016/j.cag.2024.104091","DOIUrl":"10.1016/j.cag.2024.104091","url":null,"abstract":"<div><div>Earthquakes significantly impact societies and economies, underscoring the need for effective search and rescue strategies. As AI and robotics increasingly support these efforts, the demand for high-fidelity, real-time simulation environments for training has become pressing. Earthquake simulation can be considered as a complex system. Traditional simulation methods, which primarily focus on computing intricate factors for single buildings or simplified architectural agglomerations, often fall short in providing realistic visuals and real-time structural damage assessments for urban environments. To address this deficiency, we introduce a real-time, high visual fidelity earthquake simulation platform based on the Chaos Physics System in Unreal Engine, specifically designed to simulate the damage to urban buildings. Initially, we use a genetic algorithm to calibrate material simulation parameters from Ansys into the Unreal Engine’s fracture system, based on real-world test standards. This alignment ensures the similarity of results between the two systems while achieving real-time capabilities. Additionally, by integrating real earthquake waveform data, we improve the simulation’s authenticity, ensuring it accurately reflects historical events. All functionalities are integrated into a visual user interface, enabling zero-code operation, which facilitates testing and further development by cross-disciplinary users. We verify the platform’s effectiveness through three AI-based tasks: similarity detection, path planning, and image segmentation. 
This paper builds upon the preliminary earthquake simulation study we presented at IMET 2023, with significant enhancements, including improvements to the material calibration workflow and the method for binding building foundations.</div></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"124 ","pages":"Article 104091"},"PeriodicalIF":2.5,"publicationDate":"2024-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142356926","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Controlling the scatterplot shapes of 2D and 3D multidimensional projections","authors":"Alister Machado, Alexandru Telea, Michael Behrisch","doi":"10.1016/j.cag.2024.104093","DOIUrl":"10.1016/j.cag.2024.104093","url":null,"abstract":"<div><div>Multidimensional projections are effective techniques for depicting high-dimensional data. The point patterns created by such techniques, or a technique’s <em>visual signature</em>, depend — apart from the data themselves — on the technique design and its parameter settings. Controlling such visual signatures — something that only few projections allow — can bring additional freedom for generating insightful depictions of the data. We present a novel projection technique — ShaRP — that allows explicit control on such visual signatures in terms of shapes of similar-value point clusters (settable to rectangles, triangles, ellipses, and convex polygons) and the projection space (2D or 3D Euclidean or <span><math><msup><mrow><mi>S</mi></mrow><mrow><mn>2</mn></mrow></msup></math></span>). We show that ShaRP scales computationally well with dimensionality and dataset size, provides its signature-control by a small set of parameters, allows trading off projection quality to signature enforcement, and can be used to generate decision maps to explore the behavior of trained machine-learning classifiers.</div></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"124 ","pages":"Article 104093"},"PeriodicalIF":2.5,"publicationDate":"2024-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142322382","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Ten years of immersive education: Overview of a Virtual and Augmented Reality course at postgraduate level","authors":"Bernardo Marques, Beatriz Sousa Santos, Paulo Dias","doi":"10.1016/j.cag.2024.104088","DOIUrl":"10.1016/j.cag.2024.104088","url":null,"abstract":"<div><div>In recent years, the market has seen the emergence of numerous affordable sensors, interaction devices, and displays, which have greatly facilitated the adoption of Virtual and Augmented Reality (VR/AR) across various applications. However, developing these applications requires a solid understanding of the field and specific technical skills, which are often lacking in current Computer Science and Engineering education programs. This work details an extended version from a Eurographics 2024 Education Paper, reporting a post-graduate-level course that has been taught for the past ten years to almost 200 students, across several Master’s programs. The course introduces students to the fundamental principles, methods, and tools of VR/AR. Its primary objective is to equip students with the knowledge necessary to understand, create, implement, and evaluate applications using these technologies. The paper provides insights into the course structure, key topics covered, assessment methods, as well as the devices and infrastructure utilized. It also includes an overview of various practical projects completed over the years. Among other reflections, we discuss the challenges of teaching this course, particularly due to the rapid evolution of the field, which necessitates constant updates to the curriculum. 
Finally, future perspectives for the course are outlined.</div></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"124 ","pages":"Article 104088"},"PeriodicalIF":2.5,"publicationDate":"2024-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0097849324002231/pdfft?md5=f05085791d28d06cef00928e6ebd0b31&pid=1-s2.0-S0097849324002231-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142312679","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient image generation with Contour Wavelet Diffusion","authors":"Dimeng Zhang , JiaYao Li , Zilong Chen , Yuntao Zou","doi":"10.1016/j.cag.2024.104087","DOIUrl":"10.1016/j.cag.2024.104087","url":null,"abstract":"<div><div>The burgeoning field of image generation has captivated academia and industry with its potential to produce high-quality images, facilitating applications like text-to-image conversion, image translation, and recovery. These advancements have notably propelled the growth of the metaverse, where virtual environments constructed from generated images offer new interactive experiences, especially in conjunction with digital libraries. The technology creates detailed high-quality images, enabling immersive experiences. Despite diffusion models showing promise with superior image quality and mode coverage over GANs, their slow training and inference speeds have hindered broader adoption. To counter this, we introduce the Contour Wavelet Diffusion Model, which accelerates the process by decomposing features and employing multi-directional, anisotropic analysis. This model integrates an attention mechanism to focus on high-frequency details and a reconstruction loss function to ensure image consistency and accelerate convergence. 
The result is a significant reduction in training and inference times without sacrificing image quality, making diffusion models viable for large-scale applications and enhancing their practicality in the evolving digital landscape.</div></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"124 ","pages":"Article 104087"},"PeriodicalIF":2.5,"publicationDate":"2024-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142319153","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Supporting motion-capture acting with collaborative Mixed Reality","authors":"Alberto Cannavò, Francesco Bottino, Fabrizio Lamberti","doi":"10.1016/j.cag.2024.104090","DOIUrl":"10.1016/j.cag.2024.104090","url":null,"abstract":"<div><div>Technologies such as chroma-key, LED walls, motion capture (mocap), 3D visual storyboards, and simulcams are revolutionizing how films featuring visual effects are produced. Despite their popularity, these technologies have introduced new challenges for actors. An increased workload is faced when digital characters are animated via mocap, since actors are requested to use their imagination to envision what characters see and do on set. This work investigates how Mixed Reality (MR) technology can support actors during mocap sessions by presenting a collaborative MR system named CoMR-MoCap, which allows actors to rehearse scenes by overlaying digital contents onto the real set. Using a Video See-Through Head Mounted Display (VST-HMD), actors can see digital representations of performers in mocap suits and digital scene contents in real time. The system supports collaboration, enabling multiple actors to wear both mocap suits to animate digital characters and VST-HMDs to visualize the digital contents. A user study involving 24 participants compared CoMR-MoCap to the traditional method using physical props and visual cues. 
The results showed that CoMR-MoCap significantly improved actors’ ability to position themselves and direct their gaze, and it offered advantages in terms of usability, spatial and social presence, embodiment, and perceived effectiveness over the traditional method.</div></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"124 ","pages":"Article 104090"},"PeriodicalIF":2.5,"publicationDate":"2024-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142319151","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"LightingFormer: Transformer-CNN hybrid network for low-light image enhancement","authors":"Cong Bi , Wenhua Qian , Jinde Cao , Xue Wang","doi":"10.1016/j.cag.2024.104089","DOIUrl":"10.1016/j.cag.2024.104089","url":null,"abstract":"<div><div>Recent deep-learning methods have shown promising results in low-light image enhancement. However, current methods often suffer from noise and artifacts, and most are based on convolutional neural networks, which have limitations in capturing long-range dependencies resulting in insufficient recovery of extremely dark parts in low-light images. To tackle these issues, this paper proposes a novel Transformer-based low-light image enhancement network called LightingFormer. Specifically, we propose a novel Transformer-CNN hybrid block that captures global and local information via mixed attention. It combines the advantages of the Transformer in capturing long-range dependencies and the advantages of CNNs in extracting low-level features and enhancing locality to recover extremely dark parts and enhance local details in low-light images. Moreover, we adopt the U-Net discriminator to enhance different regions in low-light images adaptively, avoiding overexposure or underexposure, and suppressing noise and artifacts. Extensive experiments show that our method outperforms the state-of-the-art methods quantitatively and qualitatively. 
Furthermore, the application to object detection demonstrates the potential of our method in high-level vision tasks.</div></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"124 ","pages":"Article 104089"},"PeriodicalIF":2.5,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142316195","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}