Sketch-guided scene image generation with diffusion model
Tianyu Zhang, Xiaoxuan Xie, Xusheng Du, Haoran Xie
Computers & Graphics, Volume 129, Article 104226. DOI: 10.1016/j.cag.2025.104226. Published 2025-04-26.

Abstract: Text-to-image models showcase the impressive ability to generate high-quality and diverse images. However, the transition from freehand sketches to complex scene images with multiple objects remains challenging in computer graphics. In this study, we propose a novel sketch-guided scene image generation framework that decomposes scene image generation from sketch inputs into object-level cross-domain generation and scene-level image construction steps. We first employ a pre-trained diffusion model to convert each single-object drawing into a separate image, which can infer additional image details while maintaining the sparse sketch structure. To preserve the conceptual fidelity of the foreground during scene generation, we invert the visual features of the object images into identity embeddings. For scene-level image construction, we generate the latent representation of the scene image using the separated background prompts, and then blend the generated foreground objects with the background image guided by the layout of the sketch inputs. We infer the scene image on the blended latent representation using a global prompt with the trained identity tokens to ensure that the foreground objects' details remain unchanged while naturally composing the scene image. Through qualitative and quantitative experiments, we demonstrate that the proposed method surpasses state-of-the-art approaches for scene image generation from hand-drawn sketches.
Breaking art: Synthesizing abstract expressionism through image rearrangement
Christopher Palazzolo, Oliver van Kaick, David Mould
Computers & Graphics, Volume 129, Article 104224. DOI: 10.1016/j.cag.2025.104224. Published 2025-04-23.

Abstract: We present an algorithm that creates interesting abstract expressionist images from segments of an input image. The algorithm operates by first segmenting the input image at multiple scales, then redistributing the resulting segments across the image plane to obtain an aesthetic abstract output. Larger segments are placed using neighborhood-aware descriptors, and smaller segments are arranged in a Poisson disk distribution. In our thorough analysis, we show that our results score highly according to several relevant aesthetic metrics, and that our style is indeed abstract expressionism. The results are visually appealing, provided the exemplar has a somewhat diverse color palette and some amount of structure.
{"title":"Inbetweening with occlusions for non-linear rough 2D animation","authors":"Melvin Even, Pierre Bénard, Pascal Barla","doi":"10.1016/j.cag.2025.104223","DOIUrl":"10.1016/j.cag.2025.104223","url":null,"abstract":"<div><div>Representing 3D motion and depth through 2D animated drawings is a notoriously difficult task, requiring time and expertise when done by hand. Artists must pay particular attention to occlusions and how they evolve through time, a tedious process. Computer-assisted inbetweening methods such as cut-out animation tools allow for such occlusions to be handled beforehand using a 2D rig, at the expense of flexibility and artistic expression.</div><div>In this work, we extend the more flexible 2D animation framework of Even et al., (2023) to handle occlusions. We do so by retaining three key properties of their system that are crucial to speed-up the animation process: input rough drawings, real-time preview, and non-linear animation editing. Our contribution is two-fold: a fast method to compute 2D masks from rough drawings with a semi-automatic dynamic layout system for occlusions between drawing parts; and a method to both automatically and manually control the dynamic visibility of strokes for self-occlusions. Such controls are not available in any traditional 2D animation software especially with rough drawings. Our system helps artists produce convincing 3D-like 2D animations, including head turns, foreshortening effects, out-of-plane rotations, overlapping volumes and even transparency.</div></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"129 ","pages":"Article 104223"},"PeriodicalIF":2.5,"publicationDate":"2025-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143869798","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring high-contrast areas context for 3D point cloud segmentation via MLP-driven Discrepancy mechanism","authors":"Yuyuan Shao , Guofeng Tong , Hao Peng","doi":"10.1016/j.cag.2025.104222","DOIUrl":"10.1016/j.cag.2025.104222","url":null,"abstract":"<div><div>Recent advancements in 3D point cloud segmentation, such as PointNext and PointVector, revisit the concise PointNet++ architecture. However, these networks struggle to capture sufficient contextual features in significant high-contrast areas. To address this, we propose a High-contrast Global Context Reasoning (HGCR) module and a Self-discrepancy Attention Encoding (SDAE) block to explore the global and local context in high-contrast regions, respectively. Specifically, HGCR leverages an MLP-driven Discrepancy (MLPD) mechanism and a Mean-pooling function to promote long-range information interactions between high-contrast areas and 3D scene. SDAE expands the degree of freedom of attention weights using an MLP-driven Self-discrepancy (MLPSD) strategy, enabling the extraction of discriminating local context in adjacent high-contrast areas. Finally, we propose a deep network called redPointHC, which follows the architecture of PointNext and PointVector. Our PointHC achieves a mIoU of 74.3% on S3DIS (Area 5), delivering superior performance compared to recent methods, surpassing PointNext by 3.5% and PointVector by 2.0%, while using fewer parameters (22.4M). Moreover, we demonstrate competitive performance with mIoU of 79.8% on S3DIS (6-fold cross-validation), improving upon PointNext by 4.9% and PointVector by 1.4%. Code is available at <span><span>https://github.com/ShaoyuyuanNEU/PointHC</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"129 ","pages":"Article 104222"},"PeriodicalIF":2.5,"publicationDate":"2025-04-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143869799","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"EMF-GAN:Efficient Multilayer Fusion GAN for text-to-image synthesis","authors":"Wenli Chen , Huihuang Zhao","doi":"10.1016/j.cag.2025.104219","DOIUrl":"10.1016/j.cag.2025.104219","url":null,"abstract":"<div><div>Text-to-image generation is a challenging and significant research task. It aims to synthesize high-quality images that match the given descriptive statements. Existing methods still have problems in generating semantic information fusion insufficiently, and the generated images cannot represent the descriptive statements properly. Therefore, A novel method named EMF-GAN (Efficient Multilayer Fusion Generative Adversarial Network) is proposed. It uses a Multilayer Fusion Module (MF Module) and Efficient Multi-Scale Attention Module (EMA Module) to fuse the semantic information into the feature maps gradually. It realizes the full utilization of the semantic information and obtains high-quality realistic images. Extensive experimental results show that our EMF-GAN is highly competitive in image generation quality and semantic consistency. Compared with the state-of-the-art methods, EMF-GAN shows significant performance improvement on both CUB (FID from 14.81 to 10.74) and COCO (FID from 19.32 to 16.86) datasets. It can generate photorealistic images with richer details and text-image consistency. Code can be found at <span><span>https://github.com/zxcnmmmmm/EMF-GAN-master</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"128 ","pages":"Article 104219"},"PeriodicalIF":2.5,"publicationDate":"2025-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143868466","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-level feature fusion parallel branching networks for point cloud learning","authors":"Biao Yan , Zhiyong Tao , Sen Lin , Heng Li","doi":"10.1016/j.cag.2025.104221","DOIUrl":"10.1016/j.cag.2025.104221","url":null,"abstract":"<div><div>As a 3D data representation format, point cloud aims to preserve the original geometric information in 3D space. Researchers have developed convolutional networks based on graph structures to overcome the sparse nature of point cloud. However, due to traditional graph convolutional networks’ shallow layers, obtaining the point cloud’s deep semantic information is complicated. This paper proposes a parallel branching network for multi-level point cloud feature fusion. The shallow feature branch constructs the local graph structure of the point cloud by the k-Nearest Neighbor (kNN) algorithm and then uses Multi-Layer Perceptrons (MLPs) to learn the local features of the point cloud. In the deep feature branch, we design a Sampling-Grouping (SG) module to down-sample the point cloud in multiple stages normalize the point cloud to improve the network performance, and then perform feature learning based on the residual network. The proposed network has been tested on benchmark datasets, including ModelNet40, ScanObjectNN, and ShapeNet Part. Our method outperforms most classical algorithms methods in the extensive classification and segmentation datasets in quantitative and qualitative evaluation metrics.</div></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"128 ","pages":"Article 104221"},"PeriodicalIF":2.5,"publicationDate":"2025-04-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143816298","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Implicit relevance inference for assembly CAD model retrieval based on design correlation representation
Yixuan Li, Baoning Ji, Jie Zhang, Jiazhen Pang, Weibo Li
Computers & Graphics, Volume 128, Article 104220. DOI: 10.1016/j.cag.2025.104220. Published 2025-04-07.

Abstract: Assembly retrieval is a crucial technology for leveraging the extensive design knowledge embedded in CAD product instances. Current methods predominantly employ pairwise similarity measurements, which treat each product model as an isolated entity and overlook the intricate design correlations that reveal high-level design development relationships. To enhance the comprehension of product design correlations within retrieval systems, this paper introduces a novel method for implicit relevance inference in assembly retrieval based on design correlation. We define a part co-occurring relationship to capture the design correlations among assemblies by clustering parts based on shape similarity. At a higher level, all assemblies in the database are organized into a multiple-correlation network based on a hypergraph, where the hyperedges represent the part co-occurring relationships. For a given query assembly, the implicit relevance between the query and other assemblies is calculated by network structure inference, solved with a random walk algorithm on the assembly hypergraph network. Comprehensive experiments show the effectiveness of the proposed assembly retrieval approach. The proposed method can be seen as an extension of existing pairwise similarity retrieval that further considers assembly relevance, which demonstrates its versatility and its ability to enhance existing pairwise similarity retrieval methods.
{"title":"Trust at every step: Embedding trust quality gates into the visual data exploration loop for machine learning-based clinical decision support systems","authors":"Dario Antweiler, Georg Fuchs","doi":"10.1016/j.cag.2025.104212","DOIUrl":"10.1016/j.cag.2025.104212","url":null,"abstract":"<div><div>Recent advancements in machine learning (ML) support novel applications in healthcare, most significantly clinical decision support systems (CDSS). The lack of trust hinders acceptance and is one of the main reasons for the limited number of successful implementations in clinical practice. Visual analytics enables the development of trustworthy ML models by providing versatile interactions and visualizations for both data scientists and healthcare professionals (HCPs). However, specific support for HCPs to build trust towards ML models through visual analytics remains underexplored. We propose an extended visual data exploration methodology to enhance trust in ML-based healthcare applications. Based on a literature review on trustworthiness of CDSS, we analyze emerging themes and their implications. By introducing trust quality gates mapped onto the Visual Data Exploration Loop, we provide structured checkpoints for multidisciplinary teams to assess and build trust. We demonstrate the applicability of this methodology in three real-world use cases – policy development, plausibility testing, and model optimization – highlighting its potential to advance trustworthy ML in the healthcare domain.</div></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"128 ","pages":"Article 104212"},"PeriodicalIF":2.5,"publicationDate":"2025-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143768243","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Global Sparse Texture Filtering for edge preservation and structural extraction
Jianwu Long, Shuang Chen, Kaixin Zhang, Yuanqin Liu, Qi Luo, Yuten Chen
Computers & Graphics, Volume 128, Article 104213. DOI: 10.1016/j.cag.2025.104213. Published 2025-03-28.

Abstract: Extracting meaningful structures from complex texture images remains a significant challenge. Texture image smoothing seeks to retain essential structures while eliminating textures, noise, and irrelevant details. However, existing smoothing algorithms often degrade small or weak structural edges when reducing dominant textures. To address this limitation, we propose a novel Global Sparse Texture Filtering (GSTF) algorithm for image smoothing. Our method introduces a texture suppression function that compresses large-scale textures while preserving smaller structures, from which a window variation mapping is formulated; combined with window total variation, this leads to a novel regularization term. Furthermore, we apply a sparse L_p norm (0 < p ≤ 1) to constrain the penalty term, enabling effective smoothing of multi-scale textures while preserving finer edges. Extensive experiments show that the proposed method is both highly effective and superior to existing techniques.