PatchMix: patch-level mixup for data augmentation in convolutional neural networks

IF 3.1 | Zone 4, Computer Science | JCR Q3, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Knowledge and Information Systems Pub Date : 2024-05-30 DOI:10.1007/s10115-024-02141-3

Yichao Hong, Yuanyuan Chen


Citations: 0

Abstract

Convolutional neural networks (CNNs) have demonstrated impressive performance in fitting data distributions. However, because learning intricate features from data is complex, networks often overfit during training. To address this issue, many data augmentation techniques have been proposed to expand the representation of the training data, thereby improving the generalization ability of CNNs. Inspired by jigsaw puzzles, we propose PatchMix, a novel mixup-based augmentation method that applies mixup to patches within an image to extract abundant and varied information from it. At the input level of CNNs, PatchMix can generate a multitude of reliable training samples through an integrated and controllable approach that encompasses cropping, combining, blurring, and more. Additionally, we propose PatchMix-R to enhance the robustness of the model against perturbations by processing adjacent pixels. Easy to implement, our methods can be integrated with most CNN-based classification models and combined with various data augmentation techniques. Experiments show that PatchMix and PatchMix-R consistently outperform other state-of-the-art methods in terms of accuracy and robustness. Class activation mappings of the trained model are also investigated to visualize the effectiveness of our approach.
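
The abstract does not spell out the exact procedure, so the following is only a rough sketch of what "applying mixup to patches within an image" could look like, not the authors' implementation. It assumes a single-channel image stored as a nested list, a regular grid of square patches, a random jigsaw-style pairing of patches, and one fixed mixing weight for all patches. The names `patch_mixup`, `patch`, and `lam` are hypothetical and chosen for illustration.

```python
import random


def patch_mixup(image, patch, lam, rng=None):
    """Blend every patch of `image` with a randomly paired patch of the same image.

    ASSUMPTION: this pairing scheme and the single weight `lam` are illustrative
    choices; the paper's actual PatchMix procedure may differ.

    image: 2D list of floats (H x W), with H and W divisible by `patch`.
    patch: side length of each square patch, in pixels.
    lam:   weight in [0, 1] given to the original patch.
    """
    rng = rng or random.Random()
    h, w = len(image), len(image[0])
    if h % patch or w % patch:
        raise ValueError("image size must be divisible by patch size")

    # Top-left corner of every patch in the grid.
    corners = [(r, c) for r in range(0, h, patch) for c in range(0, w, patch)]

    # Jigsaw step: shuffle the grid so each patch gets a partner patch.
    partners = corners[:]
    rng.shuffle(partners)

    out = [row[:] for row in image]
    for (r, c), (pr, pc) in zip(corners, partners):
        for dr in range(patch):
            for dc in range(patch):
                a = image[r + dr][c + dc]    # pixel of the original patch
                b = image[pr + dr][pc + dc]  # matching pixel of the partner patch
                # Mixup step: convex combination of the two pixels.
                out[r + dr][c + dc] = lam * a + (1.0 - lam) * b
    return out


# Usage: a 4x4 toy image split into four 2x2 patches.
img = [[float(4 * r + c) for c in range(4)] for r in range(4)]
mixed = patch_mixup(img, patch=2, lam=0.7, rng=random.Random(0))
```

Because both patches come from the same image, the class label is left unchanged in this sketch. With `lam=1.0` the image is returned as is, and smaller values blend in more of the partner patch.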




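Source journal

Knowledge and Information Systems (Engineering and Technology; Computer Science: Artificial Intelligence)

CiteScore: 5.70

Self-citation rate: 7.40%

Articles published: 152

Review time: 7.2 months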

Journal description: Knowledge and Information Systems (KAIS) provides an international forum for researchers and professionals to share their knowledge and report new advances on all topics related to knowledge systems and advanced information systems. This monthly peer-reviewed archival journal publishes state-of-the-art research reports on emerging topics in KAIS, reviews of important techniques in related areas, and application papers of interest to a general readership.