Flexible content-aware image synthesis for maritime tasks with diffusion models
Zhenfeng Xue, Yuanqi Hu, Ankang Lu, Zhuo Chen, Ying Zang, Zhonghua Miao
Applied Ocean Research, volume 158, Article 104511. Published 2025-03-19. DOI: 10.1016/j.apor.2025.104511
https://www.sciencedirect.com/science/article/pii/S0141118725000999
Citations: 0
Abstract
Maritime environmental perception suffers from a severe shortage of data because collecting data at sea is costly. This paper proposes a novel image synthesis method that automatically generates target images with diverse foregrounds and backgrounds. Specifically, a diffusion model generates foreground images in various poses, presenting different modalities of the detected target. The environmental conditions of the background images are adjusted flexibly by feeding semantic prompts to a second diffusion model. A 3D affine diffusion model is then proposed to fuse foreground and background effectively. This module computes the size and position of the foreground image within the background image through an affine transformation and uses the image fusion ability of the diffusion model to achieve high-quality image synthesis. As a result, a set of dynamically variable foreground and background images is generated, increasing the pose and weather diversity of maritime object detection samples. Extensive experiments verify the effectiveness of the proposed image synthesis method, which can also serve downstream tasks by improving the accuracy of maritime environmental perception algorithms. The code is available at https://github.com/xuezhen2018/flexible_content_aware_image_synthesis.
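The placement step mentioned in the abstract, computing the size and position of the foreground within the background through an affine transformation, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a plain 2D scale-and-translate matrix and hard mask compositing as a stand-in for the paper's 3D affine diffusion model, and the names `affine_matrix`, `placed_box`, and `paste` are hypothetical.

```python
import numpy as np


def affine_matrix(scale, tx, ty):
    """Homogeneous 3x3 matrix mapping foreground (x, y) to background (x, y).

    Assumption: uniform scale plus translation only (no rotation or shear).
    """
    return np.array([[scale, 0.0, tx],
                     [0.0, scale, ty],
                     [0.0, 0.0, 1.0]])


def placed_box(fg_shape, A):
    """Axis-aligned box (x0, y0, x1, y1) covered by the transformed foreground."""
    h, w = fg_shape[:2]
    corners = np.array([[0, 0, 1], [w, 0, 1], [0, h, 1], [w, h, 1]], dtype=float).T
    mapped = A @ corners
    xs, ys = mapped[0], mapped[1]
    return float(xs.min()), float(ys.min()), float(xs.max()), float(ys.max())


def paste(fg, mask, bg, A):
    """Composite fg onto bg using inverse mapping with nearest-neighbour sampling.

    mask is a boolean array marking which foreground pixels belong to the object.
    A real pipeline would blend the seam; here masked pixels simply overwrite bg.
    """
    out = bg.copy()
    H, W = bg.shape[:2]
    h, w = fg.shape[:2]
    ys, xs = np.mgrid[0:H, 0:W]
    # Map each background pixel centre back into foreground coordinates.
    pts = np.stack([xs.ravel() + 0.5, ys.ravel() + 0.5, np.ones(H * W)])
    src = np.linalg.inv(A) @ pts
    sx = np.floor(src[0]).astype(int)
    sy = np.floor(src[1]).astype(int)
    valid = (sx >= 0) & (sx < w) & (sy >= 0) & (sy < h)
    idx = np.flatnonzero(valid)
    idx = idx[mask[sy[idx], sx[idx]]]  # keep only pixels inside the object mask
    by, bx = np.divmod(idx, W)
    out[by, bx] = fg[sy[idx], sx[idx]]
    return out


# Toy usage: a 2x2 "ship" scaled by 2 and placed at (x=1, y=3) in an 8x8 "sea".
fg = np.full((2, 2), 255, dtype=np.uint8)
mask = np.ones((2, 2), dtype=bool)
bg = np.zeros((8, 8), dtype=np.uint8)
A = affine_matrix(scale=2.0, tx=1.0, ty=3.0)
box = placed_box(fg.shape, A)
result = paste(fg, mask, bg, A)
```

In this toy setup the scale controls the apparent size of the target and the translation controls where it sits in the scene; the diffusion-based fusion described in the abstract would replace the hard overwrite in `paste`.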
About the journal:
The aim of Applied Ocean Research is to encourage the submission of papers that advance the state of knowledge in a range of topics relevant to ocean engineering.