{"title":"SCNet: A Dual-Branch Network for Strong Noisy Image Denoising Based on Swin Transformer and ConvNeXt","authors":"Chuchao Lin, Changjun Zou, Hangbin Xu","doi":"10.1002/cav.70030","DOIUrl":"https://doi.org/10.1002/cav.70030","url":null,"abstract":"<div>\u0000 \u0000 <p>Image denoising plays a vital role in restoring high-quality images from noisy inputs and directly impacts downstream vision tasks. Traditional methods often fail under strong noise, causing detail loss or excessive smoothing. While recent Convolutional Neural Networks-based and Transformer-based models have shown progress, they struggle to jointly capture global structure and preserve local details. To address this, we propose SCNet, a dual-branch fusion network tailored for strong-noise denoising. It combines a Swin Transformer branch for global context modeling and a ConvNeXt branch for fine-grained local feature extraction. Their outputs are adaptively merged via a Feature Fusion Block using joint spatial and channel attention, ensuring semantic consistency and texture fidelity. A multi-scale upsampling module and the Charbonnier loss further improve structural accuracy and visual quality. Extensive experiments on four benchmark datasets show that SCNet outperforms state-of-the-art methods, especially under severe noise, and proves effective in real-world tasks such as mural image restoration.</p>\u0000 </div>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"36 3","pages":""},"PeriodicalIF":0.9,"publicationDate":"2025-06-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144196987","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"AIKII: An AI-Enhanced Knowledge Interactive Interface for Knowledge Representation in Educational Games","authors":"Dake Liu, Huiwen Zhao, Wen Tang, Wenwen Yang","doi":"10.1002/cav.70052","DOIUrl":"https://doi.org/10.1002/cav.70052","url":null,"abstract":"<div>\u0000 \u0000 <p>The use of generative AI to create responsive and adaptive game content has attracted considerable interest within the educational game design community, highlighting its potential as a tool for enhancing players' understanding of in-game knowledge. However, designing effective player-AI interaction to support knowledge representation remains unexplored. This paper presents AIKII, an AI-enhanced Knowledge Interaction Interface designed to facilitate knowledge representation in educational games. AIKII employs various interaction channels to represent in-game knowledge and support player engagement. To investigate its effectiveness and user learning experience, we implemented AIKII into The Journey of Poetry, an educational game centered on learning Chinese poetry, and conducted interviews with university students. The results demonstrated that our method fosters contextual and reflective connections between players and in-game knowledge, enhancing player autonomy and immersion.</p>\u0000 </div>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"36 3","pages":""},"PeriodicalIF":0.9,"publicationDate":"2025-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144197102","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DTGS: Defocus-Tolerant View Synthesis Using Gaussian Splatting","authors":"Xinying Dai, Li Yao","doi":"10.1002/cav.70045","DOIUrl":"https://doi.org/10.1002/cav.70045","url":null,"abstract":"<div>\u0000 \u0000 <p>Defocus blur poses a significant challenge for 3D reconstruction, as traditional methods often struggle to maintain detail and accuracy in blurred regions. Building upon the recent advancements in the 3DGS technique, we propose an architecture for 3D scene reconstruction from defocused blurry images. Due to the sparsity of point clouds initialized by SfM, we improve the scene representation by reasonably filling in new Gaussians where the Gaussian field is insufficient. During the optimization phase, we adjust the gradient field based on the depth values of the Gaussians and introduce perceptual loss in the objective function to reduce reconstruction bias caused by blurriness and enhance the realism of the rendered results. Experimental results on both synthetic and real datasets show that our method outperforms existing approaches in terms of reconstruction quality and robustness, even under challenging defocus blur conditions.</p>\u0000 </div>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"36 3","pages":""},"PeriodicalIF":0.9,"publicationDate":"2025-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144197101","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Joint-Learning: A Robust Segmentation Method for 3D Point Clouds Under Label Noise","authors":"Mengyao Zhang, Jie Zhou, Tingyun Miao, Yong Zhao, Xin Si, Jingliang Zhang","doi":"10.1002/cav.70038","DOIUrl":"https://doi.org/10.1002/cav.70038","url":null,"abstract":"<div>\u0000 \u0000 <p>Most of point cloud segmentation methods are based on clean datasets and are easily affected by label noise. We present a novel method called Joint-learning, which is the first attempt to apply a dual-network framework to point cloud segmentation with noisy labels. Two networks are trained simultaneously, and each network selects clean samples to update its peer network. The communication between two networks is able to exchange the knowledge they learned, possessing good robustness and generalization ability. Subsequently, adaptive sample selection is proposed to maximize the learning capacity. When the accuracies of both networks are no longer improving, the selection rate is reduced, which results in cleaner selected samples. To further reduce the impact of noisy labels, for unselected samples, we provide a joint label correction algorithm to rectify their labels via two networks' predictions. We conduct various experiments on S3DIS and ScanNet-v2 datasets under different types and rates of noises. Both quantitative and qualitative results verify the reasonableness and effectiveness of the proposed method. By comparison, our method is substantially superior to the state-of-the-art methods and achieves the best results in all noise settings. The average performance improvement is more than 7.43%, with a maximum of 11.42%.</p>\u0000 </div>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"36 3","pages":""},"PeriodicalIF":0.9,"publicationDate":"2025-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144190905","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Talking Face Generation With Lip and Identity Priors","authors":"Jiajie Wu, Frederick W. B. Li, Gary K. L. Tam, Bailin Yang, Fangzhe Nan, Jiahao Pan","doi":"10.1002/cav.70026","DOIUrl":"https://doi.org/10.1002/cav.70026","url":null,"abstract":"<div>\u0000 \u0000 <p>Speech-driven talking face video generation has attracted growing interest in recent research. While person-specific approaches yield high-fidelity results, they require extensive training data from each individual speaker. In contrast, general-purpose methods often struggle with accurate lip synchronization, identity preservation, and natural facial movements. To address these limitations, we propose a novel architecture that combines an alignment model with a rendering model. The rendering model synthesizes identity-consistent lip movements by leveraging facial landmarks derived from speech, a partially occluded target face, multi-reference lip features, and the input audio. Concurrently, the alignment model estimates optical flow using the occluded face and a static reference image, enabling precise alignment of facial poses and lip shapes. This collaborative design enhances the rendering process, resulting in more realistic and identity-preserving outputs. Extensive experiments demonstrate that our method significantly improves lip synchronization and identity retention, establishing a new benchmark in talking face video generation.</p>\u0000 </div>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"36 3","pages":""},"PeriodicalIF":0.9,"publicationDate":"2025-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144148317","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Precise Motion Inbetweening via Bidirectional Autoregressive Diffusion Models","authors":"Jiawen Peng, Zhuoran Liu, Jingzhong Lin, Gaoqi He","doi":"10.1002/cav.70040","DOIUrl":"https://doi.org/10.1002/cav.70040","url":null,"abstract":"<div>\u0000 \u0000 <p>Conditional motion diffusion models have demonstrated significant potential in generating natural and reasonable motions response to constraints such as keyframes, that can be used for motion inbetweening task. However, most methods struggle to match the keyframe constraints accurately, which resulting in unsmooth transitions between keyframes and generated motion. In this article, we propose Bidirectional Autoregressive Motion Diffusion Inbetweening (BAMDI) to generate seamless motion between start and target frames. The main idea is to transfer the motion diffusion model to autoregressive paradigm, which predicts subsequence of motion adjacent to both start and target keyframes to infill the missing frames through several iterations. This can help to improve the local consistency of generated motion. Additionally, bidirectional generation make sure the smoothness on both start frame target keyframes. Experiments show our method achieves state-of-the-art performance compared with other diffusion-based motion inbetweening methods.</p>\u0000 </div>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"36 3","pages":""},"PeriodicalIF":0.9,"publicationDate":"2025-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144171472","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PG-VTON: Front-And-Back Garment Guided Panoramic Gaussian Virtual Try-On With Diffusion Modeling","authors":"Jian Zheng, Shengwei Sang, Yifei Lu, Guojun Dai, Xiaoyang Mao, Wenhui Zhou","doi":"10.1002/cav.70054","DOIUrl":"https://doi.org/10.1002/cav.70054","url":null,"abstract":"<div>\u0000 \u0000 <p>Virtual try-on (VTON) technology enables the rapid creation of realistic try-on experiences, which makes it highly valuable for the metaverse and e-commerce. However, 2D VTON methods struggle to convey depth and immersion, while existing 3D methods require multi-view garment images and face challenges in generating high-fidelity garment textures. To address the aforementioned limitations, this paper proposes a panoramic Gaussian VTON framework guided solely by front-and-back garment information, named PG-VTON, which uses an adapted local controllable diffusion model for generating virtual dressing effects in specific regions. Specifically, PG-VTON adopts a coarse-to-fine architecture consisting of two stages. The coarse editing stage employs a local controllable diffusion model with a score distillation sampling (SDS) loss to generate coarse garment geometries with high-level semantics. Meanwhile, the refinement stage applies the same diffusion model with a photometric loss not only to enhance garment details and reduce artifacts but also to correct unwanted noise and distortions introduced during the coarse stage, thereby effectively enhancing realism. To improve training efficiency, we further introduce a dynamic noise scheduling (DNS) strategy, which ensures stable training and high-fidelity results. Experimental results demonstrate the superiority of our method, which achieves geometrically consistent and highly realistic 3D virtual try-on generation.</p>\u0000 </div>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"36 3","pages":""},"PeriodicalIF":0.9,"publicationDate":"2025-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144148302","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Robust 3D Mesh Segmentation Algorithm With Anisotropic Sparse Embedding","authors":"Mengyao Zhang, Wenting Li, Yong Zhao, Xin Si, Jingliang Zhang","doi":"10.1002/cav.70042","DOIUrl":"https://doi.org/10.1002/cav.70042","url":null,"abstract":"<div>\u0000 \u0000 <p>3D mesh segmentation, as a very challenging problem in computer graphics, has attracted considerable interest. The most popular methods in recent years are data-driven methods. However, such methods require a large amount of accurately labeled data, which is difficult to obtain. In this article, we propose a novel mesh segmentation algorithm based on anisotropic sparse embedding. We first over-segment the input mesh and get a collection of patches. Then these patches are embedded into a latent space via an anisotropic <span></span><math>\u0000 <semantics>\u0000 <mrow>\u0000 <msub>\u0000 <mrow>\u0000 <mi>L</mi>\u0000 </mrow>\u0000 <mrow>\u0000 <mn>1</mn>\u0000 </mrow>\u0000 </msub>\u0000 </mrow>\u0000 <annotation>$$ {L}_1 $$</annotation>\u0000 </semantics></math>-regularized optimization problem. In the new space, the patches that belong to the same part of the mesh will be closer, while those belonging to different parts will be farther. Finally, we can easily generate the segmentation result by clustering. Various experimental results on the PSB and COSEG datasets show that our algorithm is able to get perception-aware results and is superior to the state-of-the-art algorithms. In addition, the proposed algorithm can robustly deal with meshes with different poses, different triangulations, noises, missing regions, or missing parts.</p>\u0000 </div>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"36 3","pages":""},"PeriodicalIF":0.9,"publicationDate":"2025-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144148303","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"UTMCR: 3U-Net Transformer With Multi-Contrastive Regularization for Single Image Dehazing","authors":"HangBin Xu, ChangJun Zou, ChuChao Lin","doi":"10.1002/cav.70029","DOIUrl":"https://doi.org/10.1002/cav.70029","url":null,"abstract":"<div>\u0000 \u0000 <p>Convolutional neural networks have a long history of development in single-width dehazing tasks, but have gradually been dominated by the Transformer framework due to their insufficient global modeling capability and large number of parameters. However, the existing Transformer network structure adopts a single U-Net structure, which is insufficient in multi-level and multi-scale feature fusion and modeling capability. Therefore, we propose an end-to-end dehazing network (UTMCR-Net). The network consists of two parts: (1) UT module, which connects three U-Net networks in series, where the backbone is replaced by the Dehazeformer block. By connecting three U-Net networks in series, we can improve the image global modeling capability and capture multi-scale information at different levels to achieve multi-level and multi-scale feature fusion. (2) MCR module, which improves the original contrastive regularization method by splitting the results of the UT module into four equal blocks, which are then compared and learned by using the contrast regularization method, respectively. Specifically, we use three U-Net networks to enhance the global modeling capability of UTMCR as well as the multi-scale feature fusion capability. The image dehazing ability is further enhanced using the MCR module. Experimental results show that our method achieves better results on most datasets.</p>\u0000 </div>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"36 3","pages":""},"PeriodicalIF":0.9,"publicationDate":"2025-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144135834","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Decoupling Density Dynamics: A Neural Operator Framework for Adaptive Multi-Fluid Interactions","authors":"Yalan Zhang, Yuhang Xu, Xiaokun Wang, Angelos Chatzimparmpas, Xiaojuan Ban","doi":"10.1002/cav.70027","DOIUrl":"https://doi.org/10.1002/cav.70027","url":null,"abstract":"<div>\u0000 \u0000 <p>The dynamic interface prediction of multi-density fluids presents a fundamental challenge across computational fluid dynamics and graphics, rooted in nonlinear momentum transfer. We present Density-Conditioned Dynamic Convolution, a novel neural operator framework that establishes differentiable density-dynamics mapping through decoupled operator response. The core theoretical advancement lies in continuously adaptive neighborhood kernels that transform local density distributions into tunable filters, enabling unified representation from homogeneous media to multi-phase fluid. Experiments demonstrate autonomous evolution of physically consistent interface separation patterns in density contrast scenarios, including cocktail and bidirectional hourglass flow. Quantitative evaluation shows improved computational efficiency compared to a SPH method and qualitatively plausible interface dynamics, with a larger time step size.</p>\u0000 </div>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"36 3","pages":""},"PeriodicalIF":0.9,"publicationDate":"2025-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144140557","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}