Flexible content-aware image synthesis for maritime tasks with diffusion models

IF 4.4 · Zone 2, Engineering Technology · Q1 ENGINEERING, OCEAN

Applied Ocean Research Pub Date : 2025-03-19 DOI:10.1016/j.apor.2025.104511

Zhenfeng Xue, Yuanqi Hu, Ankang Lu, Zhuo Chen, Ying Zang, Zhonghua Miao

Applied Ocean Research, Volume 158, Article 104511. Journal Article. Full text: https://www.sciencedirect.com/science/article/pii/S0141118725000999

Citations: 0

Abstract

Maritime environmental perception suffers greatly from data scarcity because collecting data at sea is costly. This paper proposes a novel image synthesis method that automatically generates target images with diverse foregrounds and backgrounds. Specifically, foreground images in various poses are generated with a diffusion model, presenting different modalities of the detected target. The environmental conditions of the background images are flexibly adjusted by feeding semantic prompts to a second diffusion model. A 3D affine diffusion model is then proposed to fuse foreground and background effectively. This module calculates the size and position of the foreground image within the background image through an affine transformation and uses the image fusion ability of the diffusion model to achieve high-quality image synthesis. As a result, a set of dynamically variable foreground and background images is generated to increase the pose and weather diversity of maritime object detection samples. Extensive experiments verify the effectiveness of the image synthesis algorithm, and the method can also serve downstream tasks, improving the accuracy of maritime environmental perception algorithms. The code is available at https://github.com/xuezhen2018/flexible_content_aware_image_synthesis.
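The abstract says the fusion module uses an affine transformation to work out the size and position of the foreground within the background. The sketch below illustrates only that generic geometric step, in 2D and in plain Python. It is not the authors' 3D affine diffusion model, and it involves no diffusion at all. The function names (`placement_affine`, `apply_affine`, `placed_box`) and the parameters `rel_height` and `center` are hypothetical, introduced here for illustration.

```python
from typing import List, Tuple

Matrix = List[List[float]]
Point = Tuple[float, float]


def placement_affine(fg_size: Tuple[int, int], bg_size: Tuple[int, int],
                     rel_height: float, center: Point) -> Matrix:
    """Build a 3x3 homogeneous affine matrix (uniform scale + translation).

    fg_size, bg_size: (width, height) in pixels.
    rel_height: desired foreground height as a fraction of background height
                (an illustrative parameter, not taken from the paper).
    center: desired foreground center as fractions of background width/height.
    """
    fg_w, fg_h = fg_size
    bg_w, bg_h = bg_size
    scale = rel_height * bg_h / fg_h          # uniform scale keeps aspect ratio
    cx, cy = center[0] * bg_w, center[1] * bg_h
    tx = cx - scale * fg_w / 2.0              # shift so the scaled image is centered
    ty = cy - scale * fg_h / 2.0
    return [[scale, 0.0, tx],
            [0.0, scale, ty],
            [0.0, 0.0, 1.0]]


def apply_affine(m: Matrix, point: Point) -> Point:
    """Map a foreground pixel coordinate into background coordinates."""
    x, y = point
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])


def placed_box(m: Matrix, fg_size: Tuple[int, int]) -> Tuple[float, float, float, float]:
    """Bounding box (x_min, y_min, x_max, y_max) of the transformed foreground."""
    fg_w, fg_h = fg_size
    corners = [(0, 0), (fg_w, 0), (0, fg_h), (fg_w, fg_h)]
    pts = [apply_affine(m, c) for c in corners]
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    return (min(xs), min(ys), max(xs), max(ys))


# Example: a 200x100 ship crop placed in a 1000x500 scene, occupying 20% of
# the scene height and centered at (50%, 60%) of the scene.
matrix = placement_affine((200, 100), (1000, 500), 0.2, (0.5, 0.6))
box = placed_box(matrix, (200, 100))
```

For these inputs the scale factor is 1.0 and the box spans roughly x 400 to 600 and y 250 to 350, up to floating-point rounding. A box computed this way could double as the detection label for the synthesized sample, although the abstract does not say how labels are derived.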

Source journal

Applied Ocean Research (Geosciences; Engineering: Ocean)

CiteScore: 8.70

Self-citation rate: 7.00%

Articles published: 316

Review time: 59 days

Journal description: The aim of Applied Ocean Research is to encourage the submission of papers that advance the state of knowledge in a range of topics relevant to ocean engineering.