Maxime Goubeaud, Nicolla Gmyrek, Farzin Ghorban, Lucas Schelkes, A. Kummert
Random Noise Boxes: Data Augmentation for Spectrograms
2021 IEEE International Conference on Progress in Informatics and Computing (PIC), published 2021-12-17
DOI: 10.1109/PIC53636.2021.9687058 (https://doi.org/10.1109/PIC53636.2021.9687058)
Citation count: 2
Abstract
In machine learning, data augmentation is commonly used to generate synthetic samples that enlarge the datasets used to train models. The motivation behind data augmentation is to reduce the error rate of models by increasing the diversity of the dataset. In this paper, we present a new data augmentation method for spectrograms of time series that we name Random Noise Boxes. Random Noise Boxes works by duplicating each spectrogram in a dataset a predefined number of times and then replacing randomly chosen square regions of the resulting copies with boxes of random noise pixels. We demonstrate the effectiveness of the proposed method by conducting experiments with CNN classifiers of different sizes, evaluated on nine well-known datasets from the UCR Time Series Classification Archive. We show that our method is beneficial in most cases, as we observe an increase in accuracy and F1-score on most datasets.
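
The procedure described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `random_noise_boxes`, the parameters `n_copies`, `n_boxes`, and `box_size`, and the choice of uniform noise drawn from the spectrogram's own value range are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def random_noise_boxes(spectrogram, n_copies=3, n_boxes=2, box_size=4, rng=None):
    """Return n_copies augmented copies of a 2-D spectrogram.

    Hypothetical sketch: parameter names, defaults, and the uniform
    noise distribution are assumptions, not taken from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    height, width = spectrogram.shape
    if box_size > min(height, width):
        raise ValueError("box_size must fit inside the spectrogram")

    # Assumption: noise pixels are drawn from the spectrogram's value range.
    lo, hi = float(spectrogram.min()), float(spectrogram.max())

    # Step 1: duplicate the spectrogram n_copies times (input is left untouched).
    copies = np.repeat(spectrogram[np.newaxis, ...], n_copies, axis=0).astype(float)

    # Step 2: overwrite randomly placed square regions with random noise.
    for copy in copies:
        for _ in range(n_boxes):
            top = rng.integers(0, height - box_size + 1)
            left = rng.integers(0, width - box_size + 1)
            noise = rng.uniform(lo, hi, size=(box_size, box_size))
            copy[top:top + box_size, left:left + box_size] = noise
    return copies
```

Applied to every spectrogram in a training set, a call such as `random_noise_boxes(spec, n_copies=3)` would yield three noisy variants per sample; whether the unmodified original is kept alongside them is a design choice the abstract leaves open.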