{"title":"Augmented Reality-Based Interactive Scheme for Robot-Assisted Percutaneous Renal Puncture Navigation","authors":"Yiwei Zhuang, Shuyi Wang, Hua Xie, Wei Qing, Haoliang Li, Yuhan Shen, Yichun Shen","doi":"10.1002/cav.70009","DOIUrl":"https://doi.org/10.1002/cav.70009","url":null,"abstract":"In this paper, we present an Augmented Reality (AR)-based application combined with a robotic system for percutaneous renal puncture navigation interaction and demonstrate its technical feasibility. Our system provides an intuitive interaction scheme between the surgeon and the robot without the need for traditional external input devices, and applies an image-target-based 3D registration scheme to transform the coordinate system between Hololens2 and the robot without using additional tracking devices. Users can visualize the abdominal puncture phantom and obtain 3D depth information of the lesion site by wearing Hololens2 and control the robot directly using buttons or gestures. To investigate the accuracy and feasibility of the proposed interaction scheme, six subjects were recruited to complete 3D registration alignment accuracy experiments, and puncture positioning accuracy experiments using ultrasound unaided navigation, AR unaided navigation and AR robotic navigation. The results showed that the average alignment error of 3D registration was 3.61 ± 1.05 mm. The average positioning errors of ultrasound freehand navigation, AR freehand navigation and AR robotic navigation were 7.67 ± 2.00 mm, 6.13 ± 1.07 mm and 5.52 ± 0.37 mm, respectively; the average puncture times were 34.86 ± 1.67 s, 22.40 ± 2.07 s, and 29.41 ± 1.37 s.","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"36 1","pages":""},"PeriodicalIF":0.9,"publicationDate":"2025-02-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143362755","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Advanced Gesture Recognition Method Based on Fractional Fourier Transform and Relevance Vector Machine for Smart Home Appliances","authors":"Xie Hong-qin, Zhao Yuan-yuan","doi":"10.1002/cav.70011","DOIUrl":"https://doi.org/10.1002/cav.70011","url":null,"abstract":"Addressing the challenges of low feature extraction dimensions and insufficient distinct information for gesture differentiation for smart home appliances, this article proposed an innovative gesture recognition algorithm, integrating fractional Fourier transform (FrFT) with relevance vector machine (RVM). The process involves using FrFT to transform raw gesture data into the fractional domain, thereby expanding the dimensions of information extraction. Subsequently, high-dimensional feature vectors are created from fractional domain, and RVM classifiers are employed for joint optimization of feature selection and classification decision functions, achieving optimal classification performance. A dataset was constructed using five different types of gestures recorded on the TI millimeter-wave radar platform to validate the effectiveness of this method. The experimental results demonstrate that the RVM selected the optimal FrFT order of 0.6, with the best feature set comprising fractional spectral entropy, peak factor, and second-order central moment. Recognition rates for each gesture exceeded 96.2%, with an average rate of 98.5%. This performance surpasses three comparative methods in both recognition accuracy and real-time processing, indicating high potential for future applications.","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"36 1","pages":""},"PeriodicalIF":0.9,"publicationDate":"2025-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143120979","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Visual Expansion and Real-Time Calibration for Pan-Tilt-Zoom Cameras Assisted by Panoramic Models","authors":"Liangliang Cai, Zhong Zhou","doi":"10.1002/cav.70015","DOIUrl":"https://doi.org/10.1002/cav.70015","url":null,"abstract":"Pan-tilt-zoom (PTZ) cameras, which dynamically adjust their field of view (FOV), are pervasive in large-scale scenes, such as train stations, squares, and airports. In real scenarios, PTZ cameras are required to quickly judge their directions using contextual clues from the surrounding environment. To achieve this goal, some research projects camera videos into three-dimensional (3D) models or panoramas and allows operators to establish spatial relationships. However, these works face several challenges in terms of real-time processing, localization accuracy, and realistic reference. To address this problem, a visual expansion and real-time calibration for PTZ cameras assisted by panoramic models is proposed. The calibration method consists of three parts: Providing a real environment background by building a panoramic model, meeting the needs of real-time processing by establishing a PTZ camera motion estimation model and achieving high-precision alignment between PTZ images and panoramic models using only two feature point pairs. Our methods were validated using both the public and our Scene dataset. The experimental results indicate that our method outperforms other state-of-the-art methods in terms of real-time processing, accuracy, and robustness.","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"36 1","pages":""},"PeriodicalIF":0.9,"publicationDate":"2025-01-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143120498","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Creating an Anthropomorphic Folktale Animal: A Pilot Study on Character Design Creativity Derived From Autonomous Behavior Generation Powered by Reinforcement Learning","authors":"Hongju Yang, Seung Wan Hong","doi":"10.1002/cav.70013","DOIUrl":"https://doi.org/10.1002/cav.70013","url":null,"abstract":"Popular in fantasy films, games, and extended reality, anthropomorphic animals often rely on animator creativity and real animal observation for behavior visualization. This artistic approach captures emotional traits but lacks uncovering diverse, unanticipated behaviors beyond creators' concepts. To enrich character design, this study employs reinforcement learning (RL) agent simulation to explore the autonomous behavior and unexpected responses of the nine-tailed Fox Sister from Korean folklore. As a method, the agent, with a physics-based controller and skeletal joints, uses hybrid action control to transition between bipedal and quadrupedal actions based on the environment. In result, RL character frequently exhibits behavioral shifts, including unexpected actions in response to training steps and terrain complexities like slopes and hurdles, distinguishing them from animation-based finite-state machines. Additionally, this study validates impacts of RL character on character design creativity. To investigate such unknown impacts, this study conducts a comparative pilot study that recruits five character designers under use and nonuse scenario of RL character. Analysis indicates that RL character promotes creativity of character design, conceptualization, and development of scenario and character's attribute. This study highlights RL's potential for visualizing diverse inspirational behaviors of folkloric creatures by simulating interactions between body structure, motion, and environment.","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"36 1","pages":""},"PeriodicalIF":0.9,"publicationDate":"2025-01-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143120499","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Multi-Model Approach for Attention Prediction in Gaming Environments for Autistic Children","authors":"P. Valarmathi, A. Packialatha","doi":"10.1002/cav.70010","DOIUrl":"https://doi.org/10.1002/cav.70010","url":null,"abstract":"Autism spectrum disorder (ASD) is a neurological condition that affects an individual's mental development. This research work implements a multimodality input-based virtual reality (VR)-enabled attention prediction approach in gaming for children with autism. Initially, the multimodal inputs such as face image, electroencephalogram (EEG) signal, and data are individually processed by both the preprocessing and feature extraction procedures. Subsequently, a hybrid classification model with classifiers such as improved deep convolutional neural network (IDCNN) and long short term memory (LSTM) is utilized in expression detection by concatenating the resultant features obtained from the feature extraction procedure. Here, the conventional deep convolutional neural network (DCNN) approach is improved by a novel block-knowledge-based processing with a proposed sine-hinge loss function. Finally, an improved weighted mutual information process is employed in attention prediction. Moreover, this proposed attention prediction model is analyzed by simulation and experimental analyses. The effectiveness of the proposed model is significantly proved by the experimental results obtained from various analyses.","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"36 1","pages":""},"PeriodicalIF":0.9,"publicationDate":"2025-01-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143120500","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"WDANet: Exploring Stylized Animation via Diffusion Model for Woodcut-Style Design","authors":"Yangchunxue Ou, Jingjun Xu","doi":"10.1002/cav.70007","DOIUrl":"https://doi.org/10.1002/cav.70007","url":null,"abstract":"Stylized animation strives for innovation and bold visual creativity. Integrating the inherent strong visual impact and color contrast of woodcut style into such animations is both appealing and challenging, especially during the design phase. Traditional woodcut methods, hand-drawing, and previous computer-aided techniques face challenges such as dwindling design inspiration, lengthy production times, and complex adjustment procedures. To address these issues, we propose a novel network framework, the Woodcut-style Design Assistant Network (WDANet). Our research is the first to use diffusion models to streamline the woodcut-style design process. We curate the Woodcut-62 dataset, which features works from 62 renowned historical artists, to train WDANet in capturing and learning the aesthetic nuances of woodcut prints. WDANet, based on the denoising U-Net, effectively decouples content and style features. It allows users to input or slightly modify a text description to quickly generate accurate, high-quality woodcut-style designs, saving time and offering flexibility. Quantitative and qualitative analyses, along with user studies, confirm that WDANet outperforms current state-of-the-art methods in generating woodcut-style images, demonstrating its value as a design aid.","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"36 1","pages":""},"PeriodicalIF":0.9,"publicationDate":"2025-01-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143113147","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Novel View Synthesis Based on Similar Perspective","authors":"Wenkang Huang","doi":"10.1002/cav.70006","DOIUrl":"https://doi.org/10.1002/cav.70006","url":null,"abstract":"Neural radiance fields (NeRF) technology has garnered significant attention due to its exceptional performance in generating high-quality novel view images. In this study, we propose an innovative method that leverages the similarity between views to enhance the quality of novel view image generation. Initially, a pre-trained NeRF model generates an initial novel view image, which is subsequently compared and subjected to feature transfer with the most similar reference view from the training dataset. Following this, the reference view that is most similar to the initial novel view is selected from the training dataset. We designed a texture transfer module that employs a strategy progressing from coarse-to-fine, effectively integrating salient features from the reference view into the initial image, thus producing more realistic novel view images. By using similar views, this approach not only improves the quality of novel perspective images but also incorporates the training dataset as a dynamic information pool into the novel view integration process. This allows for the continuous acquisition and utilization of useful information from the training data throughout the synthesis process. Extensive experimental validation shows that using similar views to provide scene information significantly outperforms existing neural rendering techniques in enhancing the realism and accuracy of novel view images.","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"36 1","pages":""},"PeriodicalIF":0.9,"publicationDate":"2025-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143112801","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Body Part Segmentation of Anime Characters","authors":"Zhenhua Ou, Xueting Liu, Chengze Li, Zhenkun Wen, Ping Li, Zhijian Gao, Huisi Wu","doi":"10.1002/cav.2295","DOIUrl":"https://doi.org/10.1002/cav.2295","url":null,"abstract":"Semantic segmentation is an important approach to present the perceptual semantic understanding of an image, which is of significant usage in various applications. Especially, body part segmentation is designed for segmenting body parts of human characters to assist different editing tasks, such as style editing, pose transfer, and animation production. Since segmentation requires pixel-level precision in semantic labeling, classic heuristics-based methods generally have unstable performance. With the deployment of deep learning, a great step has been taken in segmenting body parts of human characters in natural photographs. However, the existing models are purely trained on natural photographs and generally obtain incorrect segmentation results when applied on anime character images, due to the large visual gap between training data and testing data. In this article, we present a novel approach to achieving body part segmentation of cartoon characters via a pose-based graph-cut formulation. We demonstrate the use of the acquired body part segmentation map in various image editing tasks, including conditional generation, style manipulation, pose transfer, and video-to-anime.","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"35 6","pages":""},"PeriodicalIF":0.9,"publicationDate":"2024-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142861507","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fast and Incremental 3D Model Renewal for Urban Scenes With Appearance Changes","authors":"Yuan Xiong, Zhong Zhou","doi":"10.1002/cav.70004","DOIUrl":"https://doi.org/10.1002/cav.70004","url":null,"abstract":"Urban 3D models with high-resolution details are the basis of various mixed reality and geographic information systems. Fast and accurate urban reconstruction from aerial photographs has attracted intense attention. Existing methods exploit multi-view geometry information from landscape patterns with similar illumination conditions and terrain appearance. In practice, urban models become obsolete over time due to human activities. Mainstream reconstruction pipelines rebuild the whole scene even if the main part of them remains unchanged. This paper proposes a novel wrapping-based incremental modeling framework to reuse existing models and renew them with new meshes efficiently. The paper illustrates a pose optimization method with illumination-based augmentation and virtual bundle adjustment. Besides, a high-performance wrapping-based meshing method is proposed for fast reconstruction. Experimental results show that the proposed method can achieve higher performance and quality than state-of-the-art methods.","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"35 6","pages":""},"PeriodicalIF":0.9,"publicationDate":"2024-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142851361","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Diverse Motions and Responses in Crowd Simulation","authors":"Yiwen Ma, Tingting Liu, Zhen Liu","doi":"10.1002/cav.70002","DOIUrl":"https://doi.org/10.1002/cav.70002","url":null,"abstract":"A challenge in crowd simulation is to generate diverse pedestrian motions in virtual environments. Nowadays, there is a greater emphasis on the diversity and authenticity of pedestrian movements in crowd simulation, while most traditional models primarily focus on collision avoidance and motion continuity. Recent studies have enhanced realism through data-driven approaches that exploit the movement patterns of pedestrians from real data for trajectory prediction. However, they have not taken into account the body-part motions of pedestrians. Differing from these approaches, we innovatively utilize learning-based character motion and physics animation to enhance the diversity of pedestrian motions in crowd simulation. The proposed method can provide a promising avenue for more diverse crowds and is realized by a novel framework that deeply integrates motion synthesis and physics animation with crowd simulation. The framework consists of three main components: the learning-based motion generator, which is responsible for generating diverse character motions; the hybrid simulation, which ensures the physical realism of pedestrian motions; and the velocity-based interface, which assists in integrating navigation algorithms with the motion generator. Experiments have been conducted to verify the effectiveness of the proposed method in different aspects. The visual results demonstrate the feasibility of our approach.","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"35 6","pages":""},"PeriodicalIF":0.9,"publicationDate":"2024-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142737568","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}