Computational Visual Media: Latest Publications

Message from the Best Paper Award Committee
IF 6.9 · CAS Tier 3 · Computer Science
Computational Visual Media Pub Date: 2024-05-14 DOI: 10.1007/s41095-024-0435-z
Ming C. Lin, Baoquan Chen, Ying He, Wenping Wang, Kun Zhou, Ralph Martin
{"title":"Message from the Best Paper Award Committee","authors":"Ming C. Lin, Baoquan Chen, Ying He, Wenping Wang, Kun Zhou, Ralph Martin","doi":"10.1007/s41095-024-0435-z","DOIUrl":"https://doi.org/10.1007/s41095-024-0435-z","url":null,"abstract":"","PeriodicalId":37301,"journal":{"name":"Computational Visual Media","volume":null,"pages":null},"PeriodicalIF":6.9,"publicationDate":"2024-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140980993","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Foundation models meet visualizations: Challenges and opportunities
IF 6.9 · CAS Tier 3 · Computer Science
Computational Visual Media Pub Date: 2024-05-02 DOI: 10.1007/s41095-023-0393-x
Weikai Yang, Mengchen Liu, Zheng Wang, Shixia Liu
{"title":"Foundation models meet visualizations: Challenges and opportunities","authors":"Weikai Yang, Mengchen Liu, Zheng Wang, Shixia Liu","doi":"10.1007/s41095-023-0393-x","DOIUrl":"https://doi.org/10.1007/s41095-023-0393-x","url":null,"abstract":"<p>Recent studies have indicated that foundation models, such as BERT and GPT, excel at adapting to various downstream tasks. This adaptability has made them a dominant force in building artificial intelligence (AI) systems. Moreover, a new research paradigm has emerged as visualization techniques are incorporated into these models. This study divides these intersections into two research areas: visualization for foundation model (VIS4FM) and foundation model for visualization (FM4VIS). In terms of VIS4FM, we explore the primary role of visualizations in understanding, refining, and evaluating these intricate foundation models. VIS4FM addresses the pressing need for transparency, explainability, fairness, and robustness. Conversely, in terms of FM4VIS, we highlight how foundation models can be used to advance the visualization field itself. The intersection of foundation models with visualizations is promising but also introduces a set of challenges. By highlighting these challenges and promising opportunities, this study aims to provide a starting point for the continued exploration of this research avenue.</p>","PeriodicalId":37301,"journal":{"name":"Computational Visual Media","volume":null,"pages":null},"PeriodicalIF":6.9,"publicationDate":"2024-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140883590","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
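The FM4VIS direction can be made concrete with a small example: using a foundation model's text embeddings to recommend a chart type for a natural-language analysis question. The sketch below is a hypothetical illustration, not from the paper; `embed` is a stand-in for a real sentence-embedding model, and the chart descriptions are invented.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Stand-in for a foundation-model text encoder (hypothetical).
    Hash-seeded random vectors keep the sketch self-contained."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(64)
    return v / np.linalg.norm(v)

# FM4VIS sketch: recommend a chart type by comparing the embedding of a
# natural-language question against embeddings of chart-type descriptions.
CHART_DESCRIPTIONS = {
    "bar chart": "compare quantities across discrete categories",
    "line chart": "show the trend of a value over time",
    "scatter plot": "show the correlation between two numeric variables",
}

def recommend_chart(question: str) -> str:
    q = embed(question)
    scores = {name: float(q @ embed(desc))
              for name, desc in CHART_DESCRIPTIONS.items()}
    return max(scores, key=scores.get)  # highest cosine similarity wins

print(recommend_chart("How did monthly revenue change during 2023?"))
```

Replacing `embed` with a real encoder turns this skeleton into a working FM4VIS-style recommender.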
Learning layout generation for virtual worlds
IF 6.9 · CAS Tier 3 · Computer Science
Computational Visual Media Pub Date: 2024-05-02 DOI: 10.1007/s41095-023-0365-1
Weihao Cheng, Ying Shan
{"title":"Learning layout generation for virtual worlds","authors":"Weihao Cheng, Ying Shan","doi":"10.1007/s41095-023-0365-1","DOIUrl":"https://doi.org/10.1007/s41095-023-0365-1","url":null,"abstract":"<p>The emergence of the metaverse has led to the rapidly increasing demand for the generation of extensive 3D worlds. We consider that an engaging world is built upon a rational layout of multiple land-use areas (e.g., forest, meadow, and farmland). To this end, we propose a generative model of land-use distribution that learns from geographic data. The model is based on a transformer architecture that generates a 2D map of the land-use layout, which can be conditioned on spatial and semantic controls, depending on whether either one or both are provided. This model enables diverse layout generation with user control and layout expansion by extending borders with partial inputs. To generate high-quality and satisfactory layouts, we devise a geometric objective function that supervises the model to perceive layout shapes and regularize generations using geometric priors. Additionally, we devise a planning objective function that supervises the model to perceive progressive composition demands and suppress generations deviating from controls. To evaluate the spatial distribution of the generations, we train an autoencoder to embed land-use layouts into vectors to enable comparison between the real and generated data using the Wasserstein metric, which is inspired by the Fréchet inception distance.</p>","PeriodicalId":37301,"journal":{"name":"Computational Visual Media","volume":null,"pages":null},"PeriodicalIF":6.9,"publicationDate":"2024-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140883711","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
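The evaluation step described above (embedding layouts with an autoencoder and comparing real against generated sets with a Wasserstein metric inspired by the Fréchet inception distance) has a standard closed form when both embedding sets are modeled as Gaussians. Below is a minimal sketch of that computation; the embeddings are assumed to come from the trained autoencoder, which is not shown.

```python
import numpy as np
from scipy import linalg

def frechet_distance(real: np.ndarray, fake: np.ndarray) -> float:
    """Squared 2-Wasserstein distance between Gaussians fitted to two sets
    of embedding vectors (one row per layout), as in the FID metric:
        d^2 = ||mu_r - mu_f||^2 + Tr(C_r + C_f - 2 (C_r C_f)^{1/2})
    """
    mu_r, mu_f = real.mean(axis=0), fake.mean(axis=0)
    cov_r = np.cov(real, rowvar=False)
    cov_f = np.cov(fake, rowvar=False)
    covmean = linalg.sqrtm(cov_r @ cov_f)
    if np.iscomplexobj(covmean):      # numerical noise can leave a tiny
        covmean = covmean.real        # imaginary part; discard it
    diff = mu_r - mu_f
    return float(diff @ diff + np.trace(cov_r + cov_f - 2.0 * covmean))

# Hypothetical autoencoder embeddings of real and generated layouts:
real = np.random.randn(512, 32)
fake = np.random.randn(512, 32) + 0.1
print(frechet_distance(real, fake))
```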
AdaPIP: Adaptive picture-in-picture guidance for 360° film watching
IF 6.9 · CAS Tier 3 · Computer Science
Computational Visual Media Pub Date: 2024-05-02 DOI: 10.1007/s41095-023-0347-3
Yi-Xiao Li, Guan Luo, Yi-Ke Xu, Yu He, Fang-Lue Zhang, Song-Hai Zhang
{"title":"AdaPIP: Adaptive picture-in-picture guidance for 360° film watching","authors":"Yi-Xiao Li, Guan Luo, Yi-Ke Xu, Yu He, Fang-Lue Zhang, Song-Hai Zhang","doi":"10.1007/s41095-023-0347-3","DOIUrl":"https://doi.org/10.1007/s41095-023-0347-3","url":null,"abstract":"<p>360° videos enable viewers to watch freely from different directions but inevitably prevent them from perceiving all the helpful information. To mitigate this problem, picture-in-picture (PIP) guidance was proposed using preview windows to show regions of interest (ROIs) outside the current view range. We identify several drawbacks of this representation and propose a new method for 360° film watching called AdaPIP. AdaPIP enhances traditional PIP by adaptively arranging preview windows with changeable view ranges and sizes. In addition, AdaPIP incorporates the advantage of arrow-based guidance by presenting circular windows with arrows attached to them to help users locate the corresponding ROIs more efficiently. We also adapted AdaPIP and Outside-In to HMD-based immersive virtual reality environments to demonstrate the usability of PIP-guided approaches beyond 2D screens. Comprehensive user experiments on 2D screens, as well as in VR environments, indicate that AdaPIP is superior to alternative methods in terms of visual experiences while maintaining a comparable degree of immersion.</p>","PeriodicalId":37301,"journal":{"name":"Computational Visual Media","volume":null,"pages":null},"PeriodicalIF":6.9,"publicationDate":"2024-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140883707","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
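Any guidance scheme of this kind must first decide whether an ROI lies outside the current view range, and on which side a cue should appear. The sketch below illustrates that basic geometric test using yaw angles only; it is a simplified stand-in, not AdaPIP's actual window-arrangement algorithm.

```python
def signed_yaw_offset(view_yaw: float, roi_yaw: float) -> float:
    """Shortest signed angular difference in degrees, in (-180, 180]."""
    d = (roi_yaw - view_yaw + 180.0) % 360.0 - 180.0
    return 180.0 if d == -180.0 else d

def guidance_for_roi(view_yaw: float, roi_yaw: float, fov: float = 90.0):
    """Return None if the ROI is already inside the view range; otherwise
    the side on which a preview window or arrow cue should appear."""
    d = signed_yaw_offset(view_yaw, roi_yaw)
    if abs(d) <= fov / 2.0:
        return None                       # ROI visible; no guidance needed
    return "right" if d > 0 else "left"   # cue on the nearer side

print(guidance_for_roi(view_yaw=10.0, roi_yaw=170.0))   # -> right
```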
Symmetrization of quasi-regular patterns with periodic tiling of regular polygons
IF 6.9 · CAS Tier 3 · Computer Science
Computational Visual Media Pub Date: 2024-04-27 DOI: 10.1007/s41095-023-0359-z
Zhengzheng Yin, Yao Jin, Zhijian Fang, Yun Zhang, Huaxiong Zhang, Jiu Zhou, Lili He
{"title":"Symmetrization of quasi-regular patterns with periodic tilting of regular polygons","authors":"Zhengzheng Yin, Yao Jin, Zhijian Fang, Yun Zhang, Huaxiong Zhang, Jiu Zhou, Lili He","doi":"10.1007/s41095-023-0359-z","DOIUrl":"https://doi.org/10.1007/s41095-023-0359-z","url":null,"abstract":"<p>Computer-generated aesthetic patterns are widely used as design materials in various fields. The most common methods use fractals or dynamical systems as basic tools to create various patterns. To enhance aesthetics and controllability, some researchers have introduced symmetric layouts along with these tools. One popular strategy employs dynamical systems compatible with symmetries that construct functions with the desired symmetries. However, these are typically confined to simple planar symmetries. The other generates symmetrical patterns under the constraints of tilings. Although it is slightly more flexible, it is restricted to small ranges of tilings and lacks textural variations. Thus, we proposed a new approach for generating aesthetic patterns by symmetrizing quasi-regular patterns using general <i>k</i>-uniform tilings. We adopted a unified strategy to construct invariant mappings for <i>k</i>-uniform tilings that can eliminate texture seams across the tiling edges. Furthermore, we constructed three types of symmetries associated with the patterns: dihedral, rotational, and reflection symmetries. The proposed method can be easily implemented using GPU shaders and is highly efficient and suitable for complicated tiling with regular polygons. Experiments demonstrated the advantages of our method over state-of-the-art methods in terms of flexibility in controlling the generation of patterns with various parameters as well as the diversity of textures and styles.\u0000</p>","PeriodicalId":37301,"journal":{"name":"Computational Visual Media","volume":null,"pages":null},"PeriodicalIF":6.9,"publicationDate":"2024-04-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140809277","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
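A common way to obtain a mapping invariant under a symmetry group, the key ingredient for seam-free symmetric patterns, is to average an arbitrary seed function over the group's elements. The sketch below does this for n-fold rotational symmetry in the plane; it illustrates the general principle rather than the paper's tiling-specific invariant mappings.

```python
import numpy as np

def seed(x: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Arbitrary non-symmetric texture function (placeholder)."""
    return np.sin(3.0 * x + 1.7) * np.cos(2.0 * y) + 0.5 * np.sin(x * y)

def symmetrize(x: np.ndarray, y: np.ndarray, n: int = 6) -> np.ndarray:
    """Average the seed over the cyclic group C_n; the result is invariant
    under rotation by 2*pi/n about the origin."""
    acc = np.zeros_like(x, dtype=float)
    for k in range(n):
        t = 2.0 * np.pi * k / n
        xr = np.cos(t) * x - np.sin(t) * y   # rotate the sample point
        yr = np.sin(t) * x + np.cos(t) * y
        acc += seed(xr, yr)
    return acc / n

xs, ys = np.meshgrid(np.linspace(-2, 2, 256), np.linspace(-2, 2, 256))
pattern = symmetrize(xs, ys, n=6)   # a 6-fold rotationally symmetric field
```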
Joint training with local soft attention and dual cross-neighbor label smoothing for unsupervised person re-identification
IF 6.9 · CAS Tier 3 · Computer Science
Computational Visual Media Pub Date: 2024-04-27 DOI: 10.1007/s41095-023-0354-4
Qing Han, Longfei Li, Weidong Min, Qi Wang, Qingpeng Zeng, Shimiao Cui, Jiongjin Chen
{"title":"Joint training with local soft attention and dual cross-neighbor label smoothing for unsupervised person re-identification","authors":"Qing Han, Longfei Li, Weidong Min, Qi Wang, Qingpeng Zeng, Shimiao Cui, Jiongjin Chen","doi":"10.1007/s41095-023-0354-4","DOIUrl":"https://doi.org/10.1007/s41095-023-0354-4","url":null,"abstract":"<p>Existing unsupervised person re-identification approaches fail to fully capture the fine-grained features of local regions, which can result in people with similar appearances and different identities being assigned the same label after clustering. The identity-independent information contained in different local regions leads to different levels of local noise. To address these challenges, joint training with local soft attention and dual cross-neighbor label smoothing (DCLS) is proposed in this study. First, the joint training is divided into global and local parts, whereby a soft attention mechanism is proposed for the local branch to accurately capture the subtle differences in local regions, which improves the ability of the re-identification model in identifying a person’s local significant features. Second, DCLS is designed to progressively mitigate label noise in different local regions. The DCLS uses global and local similarity metrics to semantically align the global and local regions of the person and further determines the proximity association between local regions through the cross information of neighboring regions, thereby achieving label smoothing of the global and local regions throughout the training process. In extensive experiments, the proposed method outperformed existing methods under unsupervised settings on several standard person re-identification datasets.\u0000</p>","PeriodicalId":37301,"journal":{"name":"Computational Visual Media","volume":null,"pages":null},"PeriodicalIF":6.9,"publicationDate":"2024-04-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140809186","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
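The general mechanism behind smoothing clustering pseudo-labels toward neighboring clusters can be sketched simply: the assigned cluster keeps probability 1 − ε, and ε is spread over the k nearest other centroids. This is a generic illustration of neighbor-based label smoothing, not the paper's exact DCLS formulation with its dual global/local similarity metrics.

```python
import numpy as np

def neighbor_smoothed_labels(features, centroids, hard_labels,
                             eps: float = 0.1, k: int = 3) -> np.ndarray:
    """Soft pseudo-label targets: the assigned cluster keeps 1 - eps and
    eps is spread uniformly over the k nearest other centroids. Each row
    sums to 1 and can be used with a cross-entropy loss."""
    soft = np.zeros((len(features), len(centroids)))
    for i, (f, y) in enumerate(zip(features, hard_labels)):
        d = np.linalg.norm(centroids - f, axis=1)
        d[y] = np.inf                    # exclude the assigned cluster
        nbrs = np.argsort(d)[:k]         # k nearest neighboring clusters
        soft[i, y] = 1.0 - eps
        soft[i, nbrs] = eps / k
    return soft

feats = np.random.randn(8, 16)            # instance features
cents = np.random.randn(5, 16)            # cluster centroids
labels = np.random.randint(0, 5, size=8)  # hard pseudo-labels from clustering
targets = neighbor_smoothed_labels(feats, cents, labels)
```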
DepthGAN: GAN-based depth generation from semantic layouts
IF 6.9 · CAS Tier 3 · Computer Science
Computational Visual Media Pub Date: 2024-04-27 DOI: 10.1007/s41095-023-0350-8
Yidi Li, Jun Xiao, Yiqun Wang, Zhengda Lu
{"title":"DepthGAN: GAN-based depth generation from semantic layouts","authors":"Yidi Li, Jun Xiao, Yiqun Wang, Zhengda Lu","doi":"10.1007/s41095-023-0350-8","DOIUrl":"https://doi.org/10.1007/s41095-023-0350-8","url":null,"abstract":"<p>Existing GAN-based generative methods are typically used for semantic image synthesis. We pose the question of whether GAN-based architectures can generate plausible depth maps and find that existing methods have difficulty in generating depth maps which reasonably represent 3D scene structure due to the lack of global geometric correlations. Thus, we propose DepthGAN, a novel method of generating a depth map using a semantic layout as input to aid construction, and manipulation of well-structured 3D scene point clouds. Specifically, we first build a feature generation model with a cascade of semantically-aware transformer blocks to obtain depth features with global structural information. For our semantically aware transformer block, we propose a mixed attention module and a semantically aware layer normalization module to better exploit semantic consistency for depth features generation. Moreover, we present a novel semantically weighted depth synthesis module, which generates adaptive depth intervals for the current scene. We generate the final depth map by using a weighted combination of semantically aware depth weights for different depth ranges. In this manner, we obtain a more accurate depth map. Extensive experiments on indoor and outdoor datasets demonstrate that DepthGAN achieves superior results both quantitatively and visually for the depth generation task.\u0000</p>","PeriodicalId":37301,"journal":{"name":"Computational Visual Media","volume":null,"pages":null},"PeriodicalIF":6.9,"publicationDate":"2024-04-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140809310","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
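The weighted combination over depth ranges resembles the adaptive-bins family of depth estimators: per-pixel softmax weights over a set of depth-bin centers yield a depth expectation. A generic sketch of that operation follows, with toy shapes; it is not DepthGAN's actual module.

```python
import numpy as np

def depth_from_bins(logits: np.ndarray, bin_edges: np.ndarray) -> np.ndarray:
    """Per-pixel depth as an expectation over depth bins.

    logits:    (H, W, B) unnormalized per-pixel scores for B depth bins
    bin_edges: (B + 1,) increasing bin boundaries in metres
    """
    centers = 0.5 * (bin_edges[:-1] + bin_edges[1:])      # (B,)
    z = logits - logits.max(axis=-1, keepdims=True)       # stable softmax
    w = np.exp(z)
    w /= w.sum(axis=-1, keepdims=True)
    return (w * centers).sum(axis=-1)                     # (H, W)

logits = np.random.randn(4, 4, 8)
edges = np.linspace(0.5, 10.0, 9)   # the real model adapts these per scene
print(depth_from_bins(logits, edges).shape)   # (4, 4)
```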
Physics-based fluid simulation in computer graphics: Survey, research trends, and challenges
IF 6.9 · CAS Tier 3 · Computer Science
Computational Visual Media Pub Date: 2024-04-27 DOI: 10.1007/s41095-023-0368-y
Xiaokun Wang, Yanrui Xu, Sinuo Liu, Bo Ren, Jiří Kosinka, Alexandru C. Telea, Jiamin Wang, Chongming Song, Jian Chang, Chenfeng Li, Jian Jun Zhang, Xiaojuan Ban
{"title":"Physics-based fluid simulation in computer graphics: Survey, research trends, and challenges","authors":"Xiaokun Wang, Yanrui Xu, Sinuo Liu, Bo Ren, Jiří Kosinka, Alexandru C. Telea, Jiamin Wang, Chongming Song, Jian Chang, Chenfeng Li, Jian Jun Zhang, Xiaojuan Ban","doi":"10.1007/s41095-023-0368-y","DOIUrl":"https://doi.org/10.1007/s41095-023-0368-y","url":null,"abstract":"<p>Physics-based fluid simulation has played an increasingly important role in the computer graphics community. Recent methods in this area have greatly improved the generation of complex visual effects and its computational efficiency. Novel techniques have emerged to deal with complex boundaries, multiphase fluids, gas–liquid interfaces, and fine details. The parallel use of machine learning, image processing, and fluid control technologies has brought many interesting and novel research perspectives. In this survey, we provide an introduction to theoretical concepts underpinning physics-based fluid simulation and their practical implementation, with the aim for it to serve as a guide for both newcomers and seasoned researchers to explore the field of physics-based fluid simulation, with a focus on developments in the last decade. Driven by the distribution of recent publications in the field, we structure our survey to cover physical background; discretization approaches; computational methods that address scalability; fluid interactions with other materials and interfaces; and methods for expressive aspects of surface detail and control. From a practical perspective, we give an overview of existing implementations available for the above methods.\u0000</p>","PeriodicalId":37301,"journal":{"name":"Computational Visual Media","volume":null,"pages":null},"PeriodicalIF":6.9,"publicationDate":"2024-04-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140809331","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
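Among the discretization approaches such a survey covers, smoothed-particle hydrodynamics (SPH) is the canonical particle method: field quantities are reconstructed as kernel-weighted sums over neighboring particles. A minimal density-estimation sketch using the standard poly6 kernel (brute-force neighbor search, kept deliberately simple):

```python
import numpy as np

def poly6(r2: np.ndarray, h: float) -> np.ndarray:
    """Standard poly6 smoothing kernel in 3D:
       W(r, h) = 315 / (64 pi h^9) * (h^2 - r^2)^3   for r < h, else 0."""
    coef = 315.0 / (64.0 * np.pi * h**9)
    return coef * np.maximum(h * h - r2, 0.0) ** 3

def sph_density(pos: np.ndarray, mass: float, h: float) -> np.ndarray:
    """Density at each particle: rho_i = sum_j m * W(|x_i - x_j|, h).
    Brute-force O(n^2) neighbor search, for clarity only; real solvers
    use spatial hashing or grids."""
    diff = pos[:, None, :] - pos[None, :, :]   # (n, n, 3) pairwise offsets
    r2 = (diff * diff).sum(axis=-1)            # squared distances
    return mass * poly6(r2, h).sum(axis=1)

pts = np.random.rand(200, 3) * 0.1             # particles in a 10 cm box
rho = sph_density(pts, mass=0.02, h=0.04)      # per-particle density
```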
Learning to compose diversified prompts for image emotion classification
IF 6.9 · CAS Tier 3 · Computer Science
Computational Visual Media Pub Date: 2024-04-26 DOI: 10.1007/s41095-023-0389-6
Sinuo Deng, Lifang Wu, Ge Shi, Lehao Xing, Meng Jian, Ye Xiang, Ruihai Dong
{"title":"Learning to compose diversified prompts for image emotion classification","authors":"Sinuo Deng, Lifang Wu, Ge Shi, Lehao Xing, Meng Jian, Ye Xiang, Ruihai Dong","doi":"10.1007/s41095-023-0389-6","DOIUrl":"https://doi.org/10.1007/s41095-023-0389-6","url":null,"abstract":"<p>Image emotion classification (IEC) aims to extract the abstract emotions evoked in images. Recently, language-supervised methods such as contrastive language-image pretraining (CLIP) have demonstrated superior performance in image understanding. However, the underexplored task of IEC presents three major challenges: a tremendous training objective gap between pretraining and IEC, shared suboptimal prompts, and invariant prompts for all instances. In this study, we propose a general framework that effectively exploits the language-supervised CLIP method for the IEC task. First, a prompt-tuning method that mimics the pretraining objective of CLIP is introduced, to exploit the rich image and text semantics associated with CLIP. Subsequently, instance-specific prompts are automatically composed, conditioning them on the categories and image content of instances, diversifying the prompts, and thus avoiding suboptimal problems. Evaluations on six widely used affective datasets show that the proposed method significantly outperforms state-of-the-art methods (up to 9.29% accuracy gain on the EmotionROI dataset) on IEC tasks with only a few trained parameters. The code is publicly available at https://github.com/dsn0w/PT-DPC/for research purposes.\u0000</p>","PeriodicalId":37301,"journal":{"name":"Computational Visual Media","volume":null,"pages":null},"PeriodicalIF":6.9,"publicationDate":"2024-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140883709","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
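CLIP-style zero-shot classification scores an image embedding against text embeddings of one prompt per class. The sketch below illustrates that scoring scheme with a stand-in text encoder and hand-written prompt templates; the paper's method instead learns instance-conditioned prompt vectors, which is not reproduced here.

```python
import numpy as np

EMOTIONS = ["amusement", "awe", "contentment", "excitement",
            "anger", "disgust", "fear", "sadness"]

def encode_text(prompt: str) -> np.ndarray:
    """Stand-in for a CLIP text encoder; returns a unit vector."""
    rng = np.random.default_rng(abs(hash(prompt)) % (2**32))
    v = rng.standard_normal(512)
    return v / np.linalg.norm(v)

def classify(image_emb: np.ndarray, temperature: float = 0.01) -> str:
    """Zero-shot scoring: cosine similarity between the image embedding and
    one prompt per emotion class, followed by a softmax over classes."""
    prompts = [f"a photo that evokes a feeling of {e}" for e in EMOTIONS]
    text = np.stack([encode_text(p) for p in prompts])     # (C, 512)
    sims = text @ (image_emb / np.linalg.norm(image_emb))  # cosine scores
    probs = np.exp(sims / temperature)
    probs /= probs.sum()
    return EMOTIONS[int(np.argmax(probs))]

print(classify(np.random.randn(512)))  # real input: a CLIP image embedding
```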
CTSN: Predicting cloth deformation for skeleton-based characters with a two-stream skinning network
IF 6.9 · CAS Tier 3 · Computer Science
Computational Visual Media Pub Date: 2024-04-19 DOI: 10.1007/s41095-023-0344-6
Yudi Li, Min Tang, Yun Yang, Ruofeng Tong, Shuangcai Yang, Yao Li, Bailin An, Qilong Kou
{"title":"CTSN: Predicting cloth deformation for skeleton-based characters with a two-stream skinning network","authors":"Yudi Li, Min Tang, Yun Yang, Ruofeng Tong, Shuangcai Yang, Yao Li, Bailin An, Qilong Kou","doi":"10.1007/s41095-023-0344-6","DOIUrl":"https://doi.org/10.1007/s41095-023-0344-6","url":null,"abstract":"<p>We present a novel learning method using a two-stream network to predict cloth deformation for skeleton-based characters. The characters processed in our approach are not limited to humans, and can be other targets with skeleton-based representations such as fish or pets. We use a novel network architecture which consists of skeleton-based and mesh-based residual networks to learn the coarse features and wrinkle features forming the overall residual from the template cloth mesh. Our network may be used to predict the deformation for loose or tight-fitting clothing. The memory footprint of our network is low, thereby resulting in reduced computational requirements. In practice, a prediction for a single cloth mesh for a skeleton-based character takes about 7 ms on an nVidia GeForce RTX 3090 GPU. Compared to prior methods, our network can generate finer deformation results with details and wrinkles.\u0000</p>","PeriodicalId":37301,"journal":{"name":"Computational Visual Media","volume":null,"pages":null},"PeriodicalIF":6.9,"publicationDate":"2024-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140629637","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
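The two-stream decomposition the abstract describes (deformed cloth = template mesh + a coarse residual from the skeleton-based stream + a wrinkle residual from the mesh-based stream) can be sketched at the shape level as follows. The single linear layers and toy dimensions are placeholders for the paper's residual networks.

```python
import numpy as np

N_VERTS, POSE_DIM = 100, 72     # toy sizes; real garments are much larger
rng = np.random.default_rng(0)
template = rng.standard_normal((N_VERTS, 3)) * 0.01          # rest-pose mesh
W_coarse = rng.standard_normal((N_VERTS * 3, POSE_DIM)) * 0.01
W_wrinkle = rng.standard_normal((N_VERTS * 3, N_VERTS * 3 + POSE_DIM)) * 0.01

def predict_cloth(pose: np.ndarray) -> np.ndarray:
    """Deformed cloth = template + coarse residual (skeleton stream)
    + wrinkle residual (mesh stream). Linear layers stand in for the
    two residual networks described in the abstract."""
    coarse = (W_coarse @ pose).reshape(N_VERTS, 3)            # large-scale fit
    x = np.concatenate([template.ravel(), pose])
    wrinkle = np.tanh(W_wrinkle @ x).reshape(N_VERTS, 3) * 0.005  # fine folds
    return template + coarse + wrinkle

verts = predict_cloth(rng.standard_normal(POSE_DIM))   # (100, 3) vertices
```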