{"title":"SGBGAN: minority class image generation for class-imbalanced datasets","authors":"Qian Wan, Wenhui Guo, Yanjiang Wang","doi":"10.1007/s00138-023-01506-y","DOIUrl":null,"url":null,"abstract":"<h3 data-test=\"abstract-sub-heading\">Abstract</h3><p>Class imbalance frequently arises in the context of image classification. Conventional generative adversarial networks (GANs) have a tendency to produce samples from the majority class when trained on class-imbalanced datasets. To address this issue, the Balancing GAN with gradient penalty (BAGAN-GP) has been proposed, but the outcomes may still exhibit a bias toward the majority categories when the similarity between images from different categories is substantial. In this study, we introduce a novel approach called the Pre-trained Gated Variational Autoencoder with Self-attention for Balancing Generative Adversarial Network (SGBGAN) as an image augmentation technique for generating high-quality images. The proposed method utilizes a Gated Variational Autoencoder with Self-attention (SA-GVAE) to initialize the GAN and transfers pre-trained SA-GVAE weights to the GAN. Our experimental results on Fashion-MNIST, CIFAR-10, and a highly unbalanced medical image dataset demonstrate that the SGBGAN outperforms other state-of-the-art methods. Results on Fréchet inception distance (FID) and structural similarity measures (SSIM) show that our model overcomes the instability problems that exist in other GANs. Especially on the Cells dataset, the FID of a minority class increases up to 23.09% compared to the latest BAGAN-GP, and the SSIM of a minority class increases up to 10.81%. It is proved that SGBGAN overcomes the class imbalance restriction and generates high-quality minority class images.\n</p><h3 data-test=\"abstract-sub-heading\">Graphical abstract</h3><p>The diagram provides an overview of the technical approach employed in this research paper. To address the issue of class imbalance within the dataset, a novel technique called the Gated Variational Autoencoder with Self-attention (SA-GVAE) is proposed. This SA-GVAE is utilized to initialize the Generative Adversarial Network (GAN), with the pre-trained weights from SA-GVAE being transferred to the GAN. Consequently, a Pre-trained Gated Variational Autoencoder with Self-attention for Balancing GAN (SGBGAN) is formed, serving as an image augmentation tool to generate high-quality images. Ultimately, the generation of minority samples is employed to restore class balance within the dataset.</p>","PeriodicalId":51116,"journal":{"name":"Machine Vision and Applications","volume":"200 1","pages":""},"PeriodicalIF":2.4000,"publicationDate":"2024-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Machine Vision and Applications","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s00138-023-01506-y","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
Abstract
Class imbalance frequently arises in the context of image classification. Conventional generative adversarial networks (GANs) have a tendency to produce samples from the majority class when trained on class-imbalanced datasets. To address this issue, the Balancing GAN with gradient penalty (BAGAN-GP) has been proposed, but the outcomes may still exhibit a bias toward the majority categories when the similarity between images from different categories is substantial. In this study, we introduce a novel approach called the Pre-trained Gated Variational Autoencoder with Self-attention for Balancing Generative Adversarial Network (SGBGAN) as an image augmentation technique for generating high-quality images. The proposed method utilizes a Gated Variational Autoencoder with Self-attention (SA-GVAE) to initialize the GAN and transfers pre-trained SA-GVAE weights to the GAN. Our experimental results on Fashion-MNIST, CIFAR-10, and a highly unbalanced medical image dataset demonstrate that the SGBGAN outperforms other state-of-the-art methods. Results on Fréchet inception distance (FID) and structural similarity measures (SSIM) show that our model overcomes the instability problems that exist in other GANs. Especially on the Cells dataset, the FID of a minority class increases up to 23.09% compared to the latest BAGAN-GP, and the SSIM of a minority class increases up to 10.81%. It is proved that SGBGAN overcomes the class imbalance restriction and generates high-quality minority class images.
Graphical abstract
The diagram provides an overview of the technical approach employed in this research paper. To address the issue of class imbalance within the dataset, a novel technique called the Gated Variational Autoencoder with Self-attention (SA-GVAE) is proposed. This SA-GVAE is utilized to initialize the Generative Adversarial Network (GAN), with the pre-trained weights from SA-GVAE being transferred to the GAN. Consequently, a Pre-trained Gated Variational Autoencoder with Self-attention for Balancing GAN (SGBGAN) is formed, serving as an image augmentation tool to generate high-quality images. Ultimately, the generation of minority samples is employed to restore class balance within the dataset.
期刊介绍:
Machine Vision and Applications publishes high-quality technical contributions in machine vision research and development. Specifically, the editors encourage submittals in all applications and engineering aspects of image-related computing. In particular, original contributions dealing with scientific, commercial, industrial, military, and biomedical applications of machine vision, are all within the scope of the journal.
Particular emphasis is placed on engineering and technology aspects of image processing and computer vision.
The following aspects of machine vision applications are of interest: algorithms, architectures, VLSI implementations, AI techniques and expert systems for machine vision, front-end sensing, multidimensional and multisensor machine vision, real-time techniques, image databases, virtual reality and visualization. Papers must include a significant experimental validation component.