利用生成模型在有限数据集上进行物体检测的迁移学习

IF 4.6 2区物理与天体物理 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Machine Learning Science and Technology Pub Date : 2024-08-11 DOI:10.1088/2632-2153/ad65b5

M Paiano, S Martina, C Giannelli and F Caruso

{"title":"利用生成模型在有限数据集上进行物体检测的迁移学习","authors":"M Paiano, S Martina, C Giannelli and F Caruso","doi":"10.1088/2632-2153/ad65b5","DOIUrl":null,"url":null,"abstract":"The availability of data is limited in some fields, especially for object detection tasks, where it is necessary to have correctly labeled bounding boxes around each object. A notable example of such data scarcity is found in the domain of marine biology, where it is useful to develop methods to automatically detect submarine species for environmental monitoring. To address this data limitation, the state-of-the-art machine learning strategies employ two main approaches. The first involves pretraining models on existing datasets before generalizing to the specific domain of interest. The second strategy is to create synthetic datasets specifically tailored to the target domain using methods like copy-paste techniques or ad-hoc simulators. The first strategy often faces a significant domain shift, while the second demands custom solutions crafted for the specific task. In response to these challenges, here we propose a transfer learning framework that is valid for a generic scenario. In this framework, generated images help to improve the performances of an object detector in a few-real data regime. This is achieved through a diffusion-based generative model that was pretrained on large generic datasets. With respect to the state-of-the-art, we find that it is not necessary to fine tune the generative model on the specific domain of interest. We believe that this is an important advance because it mitigates the labor-intensive task of manual labeling the images in object detection tasks. We validate our approach focusing on fishes in an underwater environment, and on the more common domain of cars in an urban setting. Our method achieves detection performance comparable to models trained on thousands of images, using only a few hundreds of input data. Our results pave the way for new generative AI-based protocols for machine learning applications in various domains, for instance ranging from geophysics to biology and medicine.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"192 1","pages":""},"PeriodicalIF":4.6000,"publicationDate":"2024-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Transfer learning with generative models for object detection on limited datasets\",\"authors\":\"M Paiano, S Martina, C Giannelli and F Caruso\",\"doi\":\"10.1088/2632-2153/ad65b5\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The availability of data is limited in some fields, especially for object detection tasks, where it is necessary to have correctly labeled bounding boxes around each object. A notable example of such data scarcity is found in the domain of marine biology, where it is useful to develop methods to automatically detect submarine species for environmental monitoring. To address this data limitation, the state-of-the-art machine learning strategies employ two main approaches. The first involves pretraining models on existing datasets before generalizing to the specific domain of interest. The second strategy is to create synthetic datasets specifically tailored to the target domain using methods like copy-paste techniques or ad-hoc simulators. The first strategy often faces a significant domain shift, while the second demands custom solutions crafted for the specific task. In response to these challenges, here we propose a transfer learning framework that is valid for a generic scenario. In this framework, generated images help to improve the performances of an object detector in a few-real data regime. This is achieved through a diffusion-based generative model that was pretrained on large generic datasets. With respect to the state-of-the-art, we find that it is not necessary to fine tune the generative model on the specific domain of interest. We believe that this is an important advance because it mitigates the labor-intensive task of manual labeling the images in object detection tasks. We validate our approach focusing on fishes in an underwater environment, and on the more common domain of cars in an urban setting. Our method achieves detection performance comparable to models trained on thousands of images, using only a few hundreds of input data. Our results pave the way for new generative AI-based protocols for machine learning applications in various domains, for instance ranging from geophysics to biology and medicine.\",\"PeriodicalId\":33757,\"journal\":{\"name\":\"Machine Learning Science and Technology\",\"volume\":\"192 1\",\"pages\":\"\"},\"PeriodicalIF\":4.6000,\"publicationDate\":\"2024-08-11\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Machine Learning Science and Technology\",\"FirstCategoryId\":\"101\",\"ListUrlMain\":\"https://doi.org/10.1088/2632-2153/ad65b5\",\"RegionNum\":2,\"RegionCategory\":\"物理与天体物理\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Machine Learning Science and Technology","FirstCategoryId":"101","ListUrlMain":"https://doi.org/10.1088/2632-2153/ad65b5","RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

摘要

在某些领域，数据的可用性是有限的，特别是在物体检测任务中，需要在每个物体周围有正确标注的边界框。海洋生物学领域就是这种数据匮乏的一个显著例子，在该领域，开发用于环境监测的海底物种自动检测方法非常有用。为了解决这种数据限制，最先进的机器学习策略主要采用两种方法。第一种方法是在现有数据集上对模型进行预训练，然后再将其推广到感兴趣的特定领域。第二种策略是使用复制粘贴技术或临时模拟器等方法创建专门针对目标领域的合成数据集。第一种策略通常会面临重大的领域转变，而第二种策略则需要为特定任务量身定制解决方案。为了应对这些挑战，我们在这里提出了一个适用于通用场景的迁移学习框架。在这个框架中，生成的图像有助于提高物体检测器在少量真实数据环境中的性能。这是通过在大型通用数据集上预训练的基于扩散的生成模型实现的。与最先进的技术相比，我们发现无需对特定兴趣领域的生成模型进行微调。我们认为这是一个重要的进步，因为它减轻了物体检测任务中人工标记图像的劳动密集型任务。我们以水下环境中的鱼类和城市环境中更常见的汽车领域为重点，对我们的方法进行了验证。我们的方法只使用了几百个输入数据，就实现了与在数千张图像上训练的模型相当的检测性能。我们的研究成果为新的基于生成式人工智能的协议铺平了道路，该协议适用于从地球物理学到生物学和医学等各个领域的机器学习应用。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Transfer learning with generative models for object detection on limited datasets

The availability of data is limited in some fields, especially for object detection tasks, where it is necessary to have correctly labeled bounding boxes around each object. A notable example of such data scarcity is found in the domain of marine biology, where it is useful to develop methods to automatically detect submarine species for environmental monitoring. To address this data limitation, the state-of-the-art machine learning strategies employ two main approaches. The first involves pretraining models on existing datasets before generalizing to the specific domain of interest. The second strategy is to create synthetic datasets specifically tailored to the target domain using methods like copy-paste techniques or ad-hoc simulators. The first strategy often faces a significant domain shift, while the second demands custom solutions crafted for the specific task. In response to these challenges, here we propose a transfer learning framework that is valid for a generic scenario. In this framework, generated images help to improve the performances of an object detector in a few-real data regime. This is achieved through a diffusion-based generative model that was pretrained on large generic datasets. With respect to the state-of-the-art, we find that it is not necessary to fine tune the generative model on the specific domain of interest. We believe that this is an important advance because it mitigates the labor-intensive task of manual labeling the images in object detection tasks. We validate our approach focusing on fishes in an underwater environment, and on the more common domain of cars in an urban setting. Our method achieves detection performance comparable to models trained on thousands of images, using only a few hundreds of input data. Our results pave the way for new generative AI-based protocols for machine learning applications in various domains, for instance ranging from geophysics to biology and medicine.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Machine Learning Science and Technology Computer Science-Artificial Intelligence

CiteScore

9.10

自引率

4.40%

发文量

审稿时长

5 weeks

期刊介绍： Machine Learning Science and Technology is a multidisciplinary open access journal that bridges the application of machine learning across the sciences with advances in machine learning methods and theory as motivated by physical insights. Specifically, articles must fall into one of the following categories: advance the state of machine learning-driven applications in the sciences or make conceptual, methodological or theoretical advances in machine learning with applications to, inspiration from, or motivated by scientific problems.