基于高级解释器深度架构的自动标记图像分割数据集的增强生成

IF 3.9 3区 计算机科学 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Marcos Sergio Pacheco dos Santos Lima Junior , Ezequiel López-Rubio , Juan Miguel Ortiz-de-Lazcano-Lobato , José David Fernández-Rodríguez
{"title":"基于高级解释器深度架构的自动标记图像分割数据集的增强生成","authors":"Marcos Sergio Pacheco dos Santos Lima Junior ,&nbsp;Ezequiel López-Rubio ,&nbsp;Juan Miguel Ortiz-de-Lazcano-Lobato ,&nbsp;José David Fernández-Rodríguez","doi":"10.1016/j.patrec.2025.04.021","DOIUrl":null,"url":null,"abstract":"<div><div>Large image datasets with annotated pixel-level semantics are necessary to train and evaluate supervised deep-learning models. These datasets are very expensive in terms of the human effort required to build them. Still, recent developments such as DatasetGAN open the possibility of leveraging generative systems to automatically synthesise massive amounts of images along with pixel-level information. This work analyses DatasetGAN and proposes a novel architecture that utilises the semantic information of neighbouring pixels to achieve significantly better performance. Additionally, the overfitting observed in the original architecture is thoroughly investigated, and modifications are proposed to mitigate it. Furthermore, the implementation has been redesigned to greatly reduce the memory requirements of DatasetGAN, and a comprehensive study of the impact of the number of classes in the segmentation task is presented.</div></div>","PeriodicalId":54638,"journal":{"name":"Pattern Recognition Letters","volume":"193 ","pages":"Pages 101-107"},"PeriodicalIF":3.9000,"publicationDate":"2025-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Enhanced generation of automatically labelled image segmentation datasets by advanced style interpreter deep architectures\",\"authors\":\"Marcos Sergio Pacheco dos Santos Lima Junior ,&nbsp;Ezequiel López-Rubio ,&nbsp;Juan Miguel Ortiz-de-Lazcano-Lobato ,&nbsp;José David Fernández-Rodríguez\",\"doi\":\"10.1016/j.patrec.2025.04.021\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Large image datasets with annotated pixel-level semantics are necessary to train and evaluate supervised deep-learning models. These datasets are very expensive in terms of the human effort required to build them. Still, recent developments such as DatasetGAN open the possibility of leveraging generative systems to automatically synthesise massive amounts of images along with pixel-level information. This work analyses DatasetGAN and proposes a novel architecture that utilises the semantic information of neighbouring pixels to achieve significantly better performance. Additionally, the overfitting observed in the original architecture is thoroughly investigated, and modifications are proposed to mitigate it. Furthermore, the implementation has been redesigned to greatly reduce the memory requirements of DatasetGAN, and a comprehensive study of the impact of the number of classes in the segmentation task is presented.</div></div>\",\"PeriodicalId\":54638,\"journal\":{\"name\":\"Pattern Recognition Letters\",\"volume\":\"193 \",\"pages\":\"Pages 101-107\"},\"PeriodicalIF\":3.9000,\"publicationDate\":\"2025-04-23\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Pattern Recognition Letters\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0167865525001540\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Pattern Recognition Letters","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0167865525001540","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0

摘要

具有带注释的像素级语义的大型图像数据集是训练和评估监督深度学习模型所必需的。就构建这些数据集所需的人力而言,这些数据集非常昂贵。尽管如此,最近的发展,如DatasetGAN,开启了利用生成系统自动合成大量图像以及像素级信息的可能性。这项工作分析了DatasetGAN,并提出了一种新的架构,利用邻近像素的语义信息来实现更好的性能。此外,对原始结构中观察到的过拟合进行了彻底的研究,并提出了相应的修改以减轻这种情况。此外,对实现进行了重新设计,大大降低了DatasetGAN的内存需求,并对类数对分割任务的影响进行了全面的研究。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Enhanced generation of automatically labelled image segmentation datasets by advanced style interpreter deep architectures
Large image datasets with annotated pixel-level semantics are necessary to train and evaluate supervised deep-learning models. These datasets are very expensive in terms of the human effort required to build them. Still, recent developments such as DatasetGAN open the possibility of leveraging generative systems to automatically synthesise massive amounts of images along with pixel-level information. This work analyses DatasetGAN and proposes a novel architecture that utilises the semantic information of neighbouring pixels to achieve significantly better performance. Additionally, the overfitting observed in the original architecture is thoroughly investigated, and modifications are proposed to mitigate it. Furthermore, the implementation has been redesigned to greatly reduce the memory requirements of DatasetGAN, and a comprehensive study of the impact of the number of classes in the segmentation task is presented.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Pattern Recognition Letters
Pattern Recognition Letters 工程技术-计算机:人工智能
CiteScore
12.40
自引率
5.90%
发文量
287
审稿时长
9.1 months
期刊介绍: Pattern Recognition Letters aims at rapid publication of concise articles of a broad interest in pattern recognition. Subject areas include all the current fields of interest represented by the Technical Committees of the International Association of Pattern Recognition, and other developing themes involving learning and recognition.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信