RADiff: Controllable Diffusion Models for Radio Astronomical Maps Generation

IEEE Transactions on Artificial Intelligence. Pub Date: 2024-08-01. DOI: 10.1109/TAI.2024.3436538

Renato Sortino;Thomas Cecconello;Andrea De Marco;Giuseppe Fiameni;Andrea Pilzer;Daniel Magro;Andrew M. Hopkins;Simone Riggi;Eva Sciacca;Adriano Ingallinera;Cristobal Bordiu;Filomena Bufano;Concetto Spampinato

Volume 5, Issue 12, pp. 6524-6535. Publication type: Journal Article. Open access: no. Source: https://ieeexplore.ieee.org/document/10620071/

Citations: 0

Abstract

With the Square Kilometer Array (SKA) nearing completion, there is an increasing demand for accurate and reliable automated solutions to extract valuable information from the vast amount of data it will acquire. Automated source finding is a particularly important task in this context, as it enables the detection and classification of astronomical objects. Deep-learning-based object detection and semantic segmentation models have proven suitable for this purpose. However, training such deep networks requires a high volume of labeled data, which is not trivial to obtain in radio astronomy. Because the data must be manually labeled by experts, this process does not scale to large datasets, which limits the use of deep networks for several tasks. In this work, we propose RADiff, a generative approach based on conditional diffusion models trained on an annotated radio dataset. It generates synthetic images containing radio sources of different morphologies, which can augment existing datasets and reduce the problems caused by class imbalance. We also show that it is possible to generate fully synthetic image-annotation pairs to automatically augment any annotated dataset. We evaluate the effectiveness of this approach by training a semantic segmentation model on a real dataset augmented in two ways: 1) using synthetic images obtained from real masks; and 2) generating images from synthetic semantic masks. Finally, we show how the model can be applied to populate background noise maps in order to simulate radio maps for data challenges.
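
The two augmentation modes named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: `generate_image` and `generate_mask` are hypothetical stand-ins for a trained mask-conditioned diffusion model and a mask generator, and the toy generators at the bottom exist only so the example runs.

```python
import random
from typing import Callable, List, Tuple

Image = List[List[float]]
Mask = List[List[int]]
Pair = Tuple[Image, Mask]


def augment_with_real_masks(
    real_pairs: List[Pair],
    generate_image: Callable[[Mask], Image],  # hypothetical conditional generator
    n_synthetic: int,
    seed: int = 0,
) -> List[Pair]:
    """Mode 1: synthesize new images conditioned on masks from the real dataset."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        _, mask = rng.choice(real_pairs)  # reuse a real annotation
        synthetic.append((generate_image(mask), mask))
    return list(real_pairs) + synthetic


def augment_with_synthetic_masks(
    real_pairs: List[Pair],
    generate_mask: Callable[[], Mask],        # hypothetical mask generator
    generate_image: Callable[[Mask], Image],  # hypothetical conditional generator
    n_synthetic: int,
) -> List[Pair]:
    """Mode 2: both the mask and the image are synthetic."""
    synthetic = []
    for _ in range(n_synthetic):
        mask = generate_mask()
        synthetic.append((generate_image(mask), mask))
    return list(real_pairs) + synthetic


# Toy stand-ins (assumptions for illustration only, not a diffusion model).
def toy_generate_image(mask: Mask) -> Image:
    return [[float(v) for v in row] for row in mask]


def toy_generate_mask() -> Mask:
    return [[0, 1], [1, 0]]


real = [([[0.1, 0.9], [0.2, 0.8]], [[0, 1], [0, 1]])]
mode1 = augment_with_real_masks(real, toy_generate_image, n_synthetic=3)
mode2 = augment_with_synthetic_masks(real, toy_generate_mask, toy_generate_image, n_synthetic=2)
```

In both modes the augmented list keeps every real pair and appends image-mask pairs, so a segmentation model could be trained on it unchanged. How the paper actually mixes real and synthetic samples is not stated in the abstract.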

Source journal

IEEE transactions on artificial intelligence

CiteScore

7.70

Self-citation rate

0.00%

Articles published