Fine-grained hierarchical dynamics for image harmonization
Authors: Peng He, Jun Yu, Liuxue Ju, Fang Gao
Journal: Neural Networks, volume 190, Article 107618 (impact factor 6.0; JCR Q1, Computer Science, Artificial Intelligence)
Published: 2025-05-27
DOI: 10.1016/j.neunet.2025.107618
URL: https://www.sciencedirect.com/science/article/pii/S0893608025004988
Citations: 0
Abstract
Image harmonization aims to produce visually consistent composite images by making the foreground compatible with the background. Existing strategies based on global transformation use background information to normalize the foreground, which can overlook significant appearance variations among regions within a scene. The coherence of local information is also critical to generating visually consistent images. To address both issues, we propose the Hierarchical Dynamics Appearance Translation (HDAT) framework, which transitions features and parameters seamlessly from local to global views within the network and adaptively adjusts foreground appearance based on the corresponding background information. Specifically, we introduce a dynamic region-aware convolution and a fine-grained mixed attention mechanism to coordinate global and local details. The dynamic region-aware convolution, guided by foreground masks, learns adaptive representations and correlations of foreground and background elements based on global dynamics. The fine-grained mixed attention dynamically adjusts features across channels and positions to achieve local adaptation. Furthermore, we integrate a novel multi-scale feature calibration strategy to ensure information consistency across scales. Extensive experiments demonstrate that HDAT significantly reduces the number of network parameters while outperforming existing methods both qualitatively and quantitatively.
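The global strategy that the abstract contrasts HDAT against, normalizing the foreground with background statistics, can be illustrated with a small sketch. This is not the HDAT method or any published implementation: the function name `harmonize_global`, the single-channel nested-list image format, and the mean/standard-deviation matching rule are all assumptions chosen for illustration.

```python
from statistics import fmean, pstdev


def harmonize_global(image, mask, eps=1e-6):
    """Match foreground mean/std to the background (hypothetical baseline).

    image: 2-D list of floats, one channel (assumed format).
    mask:  2-D list of 0/1 values, 1 marks a foreground pixel.
    """
    pairs = [(p, m) for row, mrow in zip(image, mask) for p, m in zip(row, mrow)]
    fg = [p for p, m in pairs if m]
    bg = [p for p, m in pairs if not m]
    if not fg or not bg:
        # Nothing to transfer: return an unchanged copy.
        return [row[:] for row in image]

    fg_mean, fg_std = fmean(fg), pstdev(fg)
    bg_mean, bg_std = fmean(bg), pstdev(bg)
    scale = bg_std / (fg_std + eps)  # eps avoids division by zero

    # Re-center and re-scale foreground pixels only; background is untouched.
    return [
        [(p - fg_mean) * scale + bg_mean if m else p for p, m in zip(row, mrow)]
        for row, mrow in zip(image, mask)
    ]


if __name__ == "__main__":
    image = [[0.2, 0.4, 0.9], [0.2, 0.4, 0.7]]
    mask = [[0, 0, 1], [0, 0, 1]]
    print(harmonize_global(image, mask))
```

Because a single pair of statistics is computed over the whole background, every foreground pixel receives the same shift and scale regardless of which background region it sits next to. That is the limitation the abstract attributes to global transformations and the motivation it gives for adding region-aware and local mechanisms.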
About the journal:
Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. The journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, it aims to encourage the development of biologically inspired artificial intelligence.