{"title":"A multi-view projection-based object-aware graph network for dense captioning of point clouds","authors":"Zijing Ma , Zhi Yang , Aihua Mao , Shuyi Wen , Ran Yi , Yongjin Liu","doi":"10.1016/j.cag.2024.104156","DOIUrl":"10.1016/j.cag.2024.104156","url":null,"abstract":"<div><div>3D dense captioning has received increasing attention in the multimodal field of 3D vision and language. This task aims to generate a specific descriptive sentence for each object in the 3D scene, which helps build a semantic understanding of the scene. However, due to inevitable holes in point clouds, there are often incorrect objects in the generated descriptions. Moreover, most existing models use KNN to construct relation graphs, which are not robust and have poor adaptability to different scenes. They cannot represent the relationship between the surrounding objects well. To address these challenges, in this paper, we propose a novel multi-level mixed encoding model for accurate 3D dense captioning of objects in point clouds. To handle holes in point clouds, we extract multi-view projection image features of objects based on our key observation that a hole in an object seldom exists in all projection images from different view angles. Then, the image features are fused with object detection features as the input of subsequent modules. Moreover, we combine KNN and DBSCAN clustering algorithms to construct a graph G and fuse their output features subsequently, which ensures the robustness of the graph structure for accurately describing the relationships between objects. Specifically, DBSCAN clusters are formed based on density, which alleviates the problem of using a fixed K value in KNN. Extensive experiments conducted on ScanRefer and Nr3D datasets demonstrate the effectiveness of our proposed model.</div></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"126 ","pages":"Article 104156"},"PeriodicalIF":2.5,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143096905","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Weakly supervised semantic segmentation for ancient architecture based on multiscale adaptive fusion and spectral clustering","authors":"Ruifei Sun, Sulan Zhang, Meihong Su, Lihua Hu, Jifu Zhang","doi":"10.1016/j.cag.2025.104164","DOIUrl":"10.1016/j.cag.2025.104164","url":null,"abstract":"<div><div>Existing methods of weakly supervised semantic segmentation for ancient architecture have several limitations including difficulty in capturing decorative details and achieving precise segmentation boundaries due to the many details and complex shapes of these structures. To mitigate the effect of the above issues in ancient architecture images, this paper proposes a method for weakly supervised semantic segmentation of ancient architecture based on multiscale adaptive fusion and spectral clustering. Specifically, low-level features are able to capture localized details in an image, which helps to identify small objects. In contrast, high-level features can capture the overall shape of an object, making them more effective in recognizing large objects. We use a gating mechanism to adaptively fuse high-level and low-level features in order to retain objects of different sizes. Additionally, by employing spectral clustering, pixels in ancient architectural images can be divided into different regions based on their feature similarities. These regions serve as processing units, providing precise boundaries for class activation map (CAM) and improving segmentation accuracy. Experimental results on the Ancient Architecture, Baroque Architecture, MS COCO 2014 and PASCAL VOC 2012 datasets show that the method outperforms the existing weakly supervised methods, achieving 46.9%, 55.8%, 69.9% and 38.3% in Mean Intersection Over Union (MIOU), respectively. The code is available at <span><span>https://github.com/hao530/MASC.git</span><svg><path></path></svg></span></div></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"126 ","pages":"Article 104164"},"PeriodicalIF":2.5,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143096932","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CVTLayout: Automated generation of mid-scale commercial space layout via Centroidal Voronoi Tessellation","authors":"Yuntao Wang, Wenming Wu, Yue Fei, Liping Zheng","doi":"10.1016/j.cag.2025.104175","DOIUrl":"10.1016/j.cag.2025.104175","url":null,"abstract":"<div><div>The layout of commercial space is crucial for enhancing user experience and creating business value. However, designing the layout of a mid-scale commercial space remains challenging due to the need to balance rationality, functionality, and safety. In this paper, we propose a novel method that utilizes the Centroidal Voronoi Tessellation (CVT) to generate commercial space layouts automatically. Our method is a multi-level spatial division framework, where at each level, we create and optimize Voronoi diagrams to accommodate complex multi-scale boundaries. We achieve spatial division at different levels by combining the standard Voronoi diagrams with the rectangular Voronoi diagrams. Our method also leverages Voronoi diagrams’ generation controllability and division diversity, offering customized control and diversity generation that previous methods struggled to provide. Extensive experiments and comparisons show that our method offers an automated and efficient solution for generating high-quality commercial space layouts.</div></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"127 ","pages":"Article 104175"},"PeriodicalIF":2.5,"publicationDate":"2025-01-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143349627","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Foreword to the special section on visual computing for biology and medicine (VCBM 2023)","authors":"Renata G. Raidou , James B. Procter , Christian Hansen , Thomas Höllt , Daniel Jönsson","doi":"10.1016/j.cag.2025.104168","DOIUrl":"10.1016/j.cag.2025.104168","url":null,"abstract":"<div><div>This special section of the Computers and Graphics Journal (C&G) features three articles within the scope of the EG Workshop on Visual Computing for Biology and Medicine, which took place for the 13th time on September 20–22, 2023 in Norrköping, Sweden.</div></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"127 ","pages":"Article 104168"},"PeriodicalIF":2.5,"publicationDate":"2025-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143156232","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Foreword to the Special Section on Smart Tool and Applications for Graphics (STAG 2022)","authors":"Daniela Cabiddu , Teseo Schneider , Gianmarco Cherchi","doi":"10.1016/j.cag.2025.104174","DOIUrl":"10.1016/j.cag.2025.104174","url":null,"abstract":"<div><div>This special issue contains extended and revised versions of the best papers presented at the 9th Conference on Smart Tools and Applications in Graphics (STAG 2022), held in Cagliari, on November 17–18, 2022. Three papers were selected by the appointed members of the Program Committee; extended versions were submitted and further reviewed by external experts. The result is a collection of papers spanning a broad spectrum of topics, from shape analysis and computational geometry to rendering. These include areas such as shape matching, functional maps, and realistic appearance modeling, highlighting cutting-edge advancements and novel approaches in each domain.</div></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"127 ","pages":"Article 104174"},"PeriodicalIF":2.5,"publicationDate":"2025-01-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143155674","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Visualizing NBA information via storylines","authors":"Jie Lin , Chuan-Kai Yang , Chiun-How Kao","doi":"10.1016/j.cag.2025.104169","DOIUrl":"10.1016/j.cag.2025.104169","url":null,"abstract":"<div><div>Sports visualization analysis is an important area within visualization studies. However, there is a lack of tools tailored for NBA writers among existing systems. Creating these tools would improve understanding of the game’s complex dynamics, particularly player interactions.</div><div>We propose a visualization system to improve understanding of complex NBA game data. Featuring multiple modules, it allows users to analyze the game from various perspectives. This paper highlights the system’s use of storylines to examine player interactions, enhancing the extraction of valuable insights. The study shows that our design enhances personalized in-game data analysis, improving the understanding and aiding in identifying critical moments.</div></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"127 ","pages":"Article 104169"},"PeriodicalIF":2.5,"publicationDate":"2025-01-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143155671","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ShadoCookies: Creating user viewpoint-dependent information displays on edible cookies","authors":"Takumi Yamamoto , Takashi Amesaka , Anusha Withana , Yuta Sugiura","doi":"10.1016/j.cag.2024.104158","DOIUrl":"10.1016/j.cag.2024.104158","url":null,"abstract":"<div><div>In this paper, we propose a proof-of-concept fabrication method to transform edible cookies into information displays. The proposed method encodes the surface of cookie dough so that the displayed information changes with the viewpoint. We use a computational design method where small holes are bored into cookie dough at specific angles to create shapes that are only visible from a given perspective. This method allows for selective switching of information depending on the viewpoint position. We investigate the effects of baking time, hole depth, azimuth angle changes on the presented image, and select the appropriate hole spacing based on the number of presented images. Finally, we demonstrate the results and use cases of visualizing information on cookies.</div></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"127 ","pages":"Article 104158"},"PeriodicalIF":2.5,"publicationDate":"2025-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143155595","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Prompting semantic priors for image restoration","authors":"Peigang Liu, Chenkang Wang, Yecong Wan, Penghui Lei","doi":"10.1016/j.cag.2025.104167","DOIUrl":"10.1016/j.cag.2025.104167","url":null,"abstract":"<div><div>Restoring high-quality clean images from corrupted observations, commonly referred to as image restoration, has been a longstanding challenge in the computer vision community. Existing methods often struggle to recover fine-grained contextual details due to the lack of semantic awareness of the degraded images. To overcome this limitation, we propose a novel prompt-guided semantic-aware image restoration network, termed PSAIR, which can adaptively incorporate and exploit semantic priors of degraded images and reconstruct photographically fine-grained details. Specifically, we exploit the robust degradation filtering and semantic perception capabilities of the segmentation anything model and utilize it to provide non-destructive semantic priors to aid the network’s semantic perception of the degraded images. To absorb the semantic prior, we propose a semantic fusion module that adaptively utilizes the segmentation map to modulate the features of the degraded image thereby facilitating the network to better perceive different semantic regions. Furthermore, considering that the segmentation map does not provide semantic categories, to better facilitate the network’s customized restoration of different semantics we propose a prompt-guided module which dynamically guides the restoration of different semantics via learnable visual prompts. Comprehensive experiments demonstrate that our PSAIR can restore finer contextual details and thus outperforms existing state-of-the-art methods by a large margin in terms of quantitative and qualitative evaluation.</div></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"127 ","pages":"Article 104167"},"PeriodicalIF":2.5,"publicationDate":"2025-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143155672","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FreqSpace-NeRF: A fourier-enhanced Neural Radiance Fields method via dual-domain contrastive learning for novel view synthesis","authors":"Xiaosheng Yu , Xiaolei Tian , Jubo Chen , Ying Wang","doi":"10.1016/j.cag.2025.104171","DOIUrl":"10.1016/j.cag.2025.104171","url":null,"abstract":"<div><div>Inspired by Neural Radiance Field’s (NeRF) groundbreaking success in novel view synthesis, current methods mostly employ variants of various deep neural network architectures, and use the combination of multi-scale feature maps with the target viewpoint to synthesize novel views. However, these methods only consider spatial domain features, inevitably leading to the loss of some details and edge information. To address these issues, this paper innovatively proposes the FreqSpace-NeRF (FS-NeRF), aiming to significantly enhance the rendering fidelity and generalization performance of NeRF in complex scenes by integrating the unique advantages of spectral domain and spatial domain deep neural networks, and combining contrastive learning driven data augmentation techniques. Specifically, the core contribution of this method lies in designing a dual-stream network architecture: on one hand, capturing global frequency features through Fourier transformation; on the other hand, finely refining local details using well-established spatial domain convolutional neural networks. Moreover, to ensure the model can more acutely distinguish subtle differences between different views, we propose two loss functions: Frequency-Space Contrastive Entropy Loss (FSCE Loss) and Adaptive Spectral Contrastive Loss (ASC Loss). This innovation aims to more effectively guide the data flow and focuses on minimizing the frequency discrepancies between different views. By comprehensively utilizing the fusion of spectral and spatial domain features along with contrastive learning, FS-NeRF achieves significant performance improvements in scene reconstruction tasks. Extensive qualitative and quantitative evaluations confirm that our method surpasses current state-of-the-art (SOTA) models in this field.</div></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"127 ","pages":"Article 104171"},"PeriodicalIF":2.5,"publicationDate":"2025-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143155670","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Low-light image enhancement via improved lightweight YUV attention network","authors":"Mohammed Y. Abbass, H. Kasban, Zeinab F. Elsharkawy","doi":"10.1016/j.cag.2025.104170","DOIUrl":"10.1016/j.cag.2025.104170","url":null,"abstract":"<div><div>Deep learning approaches have notable results in the area of computer vision applications. Our paper presents improved LYT-Net, a Lightweight YUV Transformer-based models, as an innovative method to improve low-light scenes. Unlike traditional Retinex-based methods, the proposed framework utilizes the chrominance (U and V) and luminance (Y) channels in YUV color-space, mitigating the complexity between color details and light in scenes. LYT-Net provides a thorough contextual realization of the image while keeping architecture burdens low. In order to tackle the issue of weak feature generation of traditional Channel-wise Denoiser (CWD) Block, improved CWD is proposed using Triplet Attention network. Triplet Attention network is exploited to capture both dynamics and static features. Qualitative and quantitative experiments demonstrate that the proposed technique effectively addresses images with varying exposure levels and outperforms state-of-the-art techniques. Furthermore, the proposed technique shows faster computational performance compared to other Retinex-based techniques, promoting it as a suitable option for real-time computer vision topics.</div><div>The source code is available at <span><span>https://github.com/Mohammed-Abbass/YUV-Attention</span><svg><path></path></svg></span></div></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"127 ","pages":"Article 104170"},"PeriodicalIF":2.5,"publicationDate":"2025-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143155673","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}