Title: Hybrid attention adaptive sampling network for human pose estimation in videos
Authors: Qianyun Song, Hao Zhang, Yanan Liu, Shouzheng Sun, Dan Xu
DOI: 10.1002/cav.2244
Journal: Computer Animation and Virtual Worlds, vol. 35, issue 4, published 2024-08-20
Abstract: Human pose estimation in videos often relies on sampling strategies such as sparse uniform sampling and keyframe selection. Sparse uniform sampling can miss spatial-temporal relationships, while keyframe selection with CNNs struggles to fully capture these relationships and is costly. Neither strategy ensures the reliability of pose data produced by single-frame estimators. To address these issues, this article proposes an efficient and effective hybrid attention adaptive sampling network. The network includes a dynamic attention module and a pose quality attention module, which jointly consider the dynamic information and the quality of the pose data. It further improves efficiency through compact uniform sampling and the parallelism of multi-head self-attention. The network is compatible with various video-based pose estimation frameworks and is more robust under heavy occlusion, motion blur, and illumination changes, achieving state-of-the-art performance on the Sub-JHMDB dataset.
Title: Nadine: A large language model-driven intelligent social robot with affective capabilities and human-like memory
Authors: Hangyeol Kang, Maher Ben Moussa, Nadia Magnenat Thalmann
DOI: 10.1002/cav.2290
Journal: Computer Animation and Virtual Worlds, vol. 35, issue 4, published 2024-08-15
Open access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/cav.2290
Abstract: In this work, we describe our approach to developing an intelligent and robust social robotic system for the Nadine social robot platform. We achieve this by integrating large language models (LLMs) and leveraging their powerful reasoning and instruction-following capabilities to realize advanced human-like affective and cognitive capabilities. This approach is novel compared with current state-of-the-art LLM-based agents, which implement neither human-like long-term memory nor sophisticated emotional capabilities. We built a social robot system that generates appropriate behaviors through multimodal input processing, retrieves episodic memories associated with the recognized user, and simulates the emotional states induced in the robot by the interaction with its human partner. In particular, we introduce an LLM-agent framework for social robots, social robotics reasoning and acting, which serves as the core component of the interaction module in our system. This design advances social robotics and aims to increase the quality of human–robot interaction.
Title: Enhancing virtual reality exposure therapy: Optimizing treatment outcomes for agoraphobia through advanced simulation and comparative analysis
Authors: Jackson Yang, Xiaoping Che, Chenxin Qu, Xiaofei Di, Haiming Liu
DOI: 10.1002/cav.2291
Journal: Computer Animation and Virtual Worlds, vol. 35, issue 4, published 2024-08-13
Abstract: This paper investigates the application of Virtual Reality Exposure Therapy (VRET) to treat agoraphobia, focusing on two pivotal research questions derived from gaps identified in current therapeutic approaches. The first question (RQ1) addresses the development of complex VR environments that enhance the therapy's effectiveness by simulating real-world anxiety triggers. The second question (RQ2) examines the differential impact of these VR environments on agoraphobic and nonagoraphobic participants through rigorous comparative analyses using t-tests. The methodology includes advanced data processing of electrodermal activity (EDA) and eye-tracking metrics to assess the anxiety levels induced by these environments. Qualitative methods such as structured interviews and questionnaires complement these measurements, providing deeper insight into participants' subjective experiences. Video recordings of sessions captured with Unity software offer an additional layer of data, enabling interactions within the VR environment to be replayed and analyzed in detail. The experimental results confirm the efficacy of the VR settings in eliciting significant physiological and psychological responses from participants, substantiating the VR scenarios' potential as a therapeutic tool. This study contributes to the broader discourse on the viability and optimization of VR technologies in clinical settings, offering a methodologically sound approach to making exposure therapies for anxiety disorders practical and accessible.
Title: Real-time simulation of thin-film interference with surface thickness variation using the shallow water equations
Authors: Mingyi Gu, Jiajia Dai, Jiazhou Chen, Ke Yan, Jing Huang
DOI: 10.1002/cav.2289
Journal: Computer Animation and Virtual Worlds, vol. 35, issue 4, published 2024-08-08
Abstract: Thin-film interference is a significant optical phenomenon. In this study, we employ the transfer matrix method to precompute the reflectance of thin films at visible wavelengths. The reflectance is stored as a texture through a color space transformation, which makes real-time rendering of thin-film interference feasible. Furthermore, we use the shallow water equations to simulate the morphological evolution of liquid thin films, facilitating the interpretation and prediction of their behavior and thickness variations. We also introduce a viscosity term into the shallow water equations to simulate thin-film behavior more accurately, producing authentic interference patterns.
Title: Frontal person image generation based on arbitrary-view human images
Authors: Yong Zhang, Yuqing Zhang, Lufei Chen, Baocai Yin, Yongliang Sun
DOI: 10.1002/cav.2234
Journal: Computer Animation and Virtual Worlds, vol. 35, issue 4, published 2024-07-23
Abstract: Frontal person images contain the richest detailed human features, which can effectively assist behavioral recognition, virtual dress fitting, and other applications. Although many remarkable networks address the person image generation task, most of them require accurate target poses as network inputs, and target pose annotation is difficult and time-consuming. In this work, we propose the first frontal person image generation network built on a proposed anchor pose set and a generative adversarial network. Specifically, our method first assigns a rough frontal pose to the input human image based on the anchor pose set and then regresses all key points of that rough frontal pose to estimate an accurate frontal pose. Taking the estimated frontal pose as the target pose, we construct a two-stream generator based on the generative adversarial network that updates the person's shape and appearance features in a crossing manner to generate a realistic frontal person image. Experiments on the challenging CMU Panoptic dataset show that our method can generate realistic frontal images from arbitrary-view human images.
Title: Peridynamic-based modeling of elastoplasticity and fracture dynamics
Authors: Haoping Wang, Xiaokun Wang, Yanrui Xu, Yalan Zhang, Chao Yao, Yu Guo, Xiaojuan Ban
DOI: 10.1002/cav.2242
Journal: Computer Animation and Virtual Worlds, vol. 35, issue 4, published 2024-07-16
Abstract: This paper introduces a particle-based framework, grounded in peridynamic theory, for simulating the behavior of elastoplastic materials and the formation of fractures. Traditional approaches to modeling elastic materials, such as the Finite Element Method (FEM) and Smoothed Particle Hydrodynamics (SPH), rely primarily on discretization techniques and continuous constitutive models, and accurately capturing fracture and crack development in elastoplastic materials poses significant challenges for these conventional models. Our approach integrates a peridynamic-based elastic model with a density constraint, enhancing stability and realism. We adopt the von Mises yield criterion and a bond stretch criterion to simulate plastic deformation and fracture formation, respectively. The proposed method stabilizes the elastic model through a density-based position constraint, while plasticity is modeled with the von Mises yield criterion applied to the bonds between particle pairs. Fracturing and the generation of fine fragments are handled by the fracture criterion and by applying complementarity operations to the inter-particle connections. Our experimental results demonstrate the efficacy of the framework in realistically depicting a wide range of material behaviors, including elasticity, plasticity, and fracturing, across various scenarios.
{"title":"GPSwap: High-resolution face swapping based on StyleGAN prior","authors":"Dongjin Huang, Chuanman Liu, Jinhua Liu","doi":"10.1002/cav.2238","DOIUrl":"https://doi.org/10.1002/cav.2238","url":null,"abstract":"<p>Existing high-resolution face-swapping works are still challenges in preserving identity consistency while maintaining high visual quality. We present a novel high-resolution face-swapping method GPSwap, which is based on StyleGAN prior. To better preserves identity consistency, the proposed facial feature recombination network fully leverages the properties of both <i>w</i> space and encoders to decouple identities. Furthermore, we presents the image reconstruction module aligns and blends images in <i>FS</i> space, which further supplements facial details and achieves natural blending. It not only improves image resolution but also optimizes visual quality. Extensive experiments and user studies demonstrate that GPSwap is superior to state-of-the-art high-resolution face-swapping methods in terms of image quality and identity consistency. In addition, GPSwap saves nearly 80% of training costs compared to other high-resolution face-swapping works.</p>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"35 4","pages":""},"PeriodicalIF":0.9,"publicationDate":"2024-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141608039","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: Neural foveated super-resolution for real-time VR rendering
Authors: Jiannan Ye, Xiaoxu Meng, Daiyun Guo, Cheng Shang, Haotian Mao, Xubo Yang
DOI: 10.1002/cav.2287
Journal: Computer Animation and Virtual Worlds, vol. 35, issue 4, published 2024-07-11
Abstract: As virtual reality display technologies advance, resolutions and refresh rates continue to approach human perceptual limits, presenting a challenge for real-time rendering algorithms. Neural super-resolution is promising for reducing computation cost and boosting the visual experience by scaling up low-resolution renderings, but the added workload of running neural networks cannot be neglected. In this article, we alleviate this burden by exploiting the foveated nature of the human visual system: we upscale the coarse input heterogeneously rather than uniformly, following the rapid decrease in visual acuity from the focal point to the periphery. With the help of dynamic and geometric information (i.e., pixel-wise motion vectors, depth, and camera transformation) inherently available in real-time rendering, we propose a neural accumulator that recurrently aggregates the amortized low-resolution visual information from frame to frame. Using a partition-assemble scheme, a neural super-resolution module upsamples the low-resolution image tiles to different qualities according to their perceptual importance and adaptively reconstructs the final output. Perceptually high-fidelity foveated high-resolution frames are generated in real time, surpassing the quality of other foveated super-resolution methods.
{"title":"Design and development of a mixed reality teaching systems for IV cannulation and clinical instruction","authors":"Wei Xiong, Yingda Peng","doi":"10.1002/cav.2288","DOIUrl":"https://doi.org/10.1002/cav.2288","url":null,"abstract":"<p>Intravenous cannulation (IV) is a common technique used in clinical infusion. This study developed a mixed reality IV cannulation teaching system based on the Hololens2 platform. The paper integrates cognitive-affective theory of learning with media (CATLM) and investigates the cognitive engagement and willingness to use the system from the learners' perspective. Through experimental research on 125 subjects, the variables affecting learners' cognitive engagement and intention to use were determined. On the basis of CATLM, three new mixed reality attributes, immersion, system verisimilitude, and response time, were introduced, and their relationships with cognitive participation and willingness to use were determined. The results show that high immersion of mixed reality technology promotes students' higher cognitive engagement; however, this high immersion does not significantly affect learners' intention to use mixed reality technology for learning. Overall, cognitive and emotional theories are effective in mixed reality environments, and the model has good adaptability. This study provides a reference for the application of mixed reality technology in medical education.</p>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"35 3","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141329416","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mastering broom-like tools for object transportation animation using deep reinforcement learning","authors":"Guan-Ting Liu, Sai-Keung Wong","doi":"10.1002/cav.2255","DOIUrl":"https://doi.org/10.1002/cav.2255","url":null,"abstract":"<div>\u0000 \u0000 <p>In this paper, we propose a deep reinforcement-based approach to generate an animation of an agent using a broom-like tool to transport a target object. The tool is attached to the agent. So when the agent moves, the tool moves as well.The challenge is to control the agent to move and use the tool to push the target while avoiding obstacles. We propose a direction sensor to guide the agent's movement direction in environments with static obstacles. Furthermore, different rewards and a curriculum learning are implemented to make the agent efficiently learn skills for manipulating the tool. Experimental results show that the agent can naturally control the tool with different shapes to transport target objects. The result of ablation tests revealed the impacts of the rewards and some state components.</p>\u0000 </div>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"35 3","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-06-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141326754","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}