{"title":"Estimating the Required Training Dataset Size for Transmitter Classification Using Deep Learning","authors":"T. Oyedare, J. Park","doi":"10.1109/DySPAN.2019.8935823","DOIUrl":null,"url":null,"abstract":"Despite the recent surge in applications of deep learning to wireless communication problems, very little is known about the training dataset size required to solve difficult problems, including transmitter classification, with acceptable accuracy. Many researchers rely on rules of thumb to estimate how much training data a given classification or identification task needs. For artificial neural network (ANN) research, these rules of thumb may suffice; for convolutional neural networks (CNNs), a class of deep neural networks, they may not hold, and researchers are often left to figure out for themselves the training dataset size needed for accurate classification. In this paper, we investigate the correlation between training dataset size and classification accuracy in transmitter classification by testing whether the rules of thumb used in ANN research apply to CNN-based transmitter classification tasks. We predict the classification performance of a CNN-based architecture for a given dataset size using a power law model and the Levenberg-Marquardt algorithm. We use the chi-squared goodness-of-fit test to validate the predicted model. Our results show that we can predict classification accuracy for larger training dataset sizes, across different experimental scenarios, with at least 97.5% accuracy. We also compare our scheme with similar prior work in wireless transmitter classification. Finally, we propose a rule of thumb for the required training dataset size in transmitter classification using CNNs.","PeriodicalId":278172,"journal":{"name":"2019 IEEE International Symposium on Dynamic Spectrum Access Networks (DySPAN)","volume":"92 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"21","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 IEEE International Symposium on Dynamic Spectrum Access Networks (DySPAN)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/DySPAN.2019.8935823","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 21
Abstract
Despite the recent surge in applications of deep learning to wireless communication problems, very little is known about the training dataset size required to solve difficult problems, including transmitter classification, with acceptable accuracy. Many researchers rely on rules of thumb to estimate how much training data a given classification or identification task needs. For artificial neural network (ANN) research, these rules of thumb may suffice; for convolutional neural networks (CNNs), a class of deep neural networks, they may not hold, and researchers are often left to figure out for themselves the training dataset size needed for accurate classification. In this paper, we investigate the correlation between training dataset size and classification accuracy in transmitter classification by testing whether the rules of thumb used in ANN research apply to CNN-based transmitter classification tasks. We predict the classification performance of a CNN-based architecture for a given dataset size using a power law model and the Levenberg-Marquardt algorithm. We use the chi-squared goodness-of-fit test to validate the predicted model. Our results show that we can predict classification accuracy for larger training dataset sizes, across different experimental scenarios, with at least 97.5% accuracy. We also compare our scheme with similar prior work in wireless transmitter classification. Finally, we propose a rule of thumb for the required training dataset size in transmitter classification using CNNs.
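The abstract names three ingredients: a power law model of accuracy versus dataset size, a Levenberg-Marquardt fit, and a chi-squared check. The sketch below shows how these could fit together; it is not the authors' implementation. It assumes an inverse power law of the form acc(n) = a - b * n^(-c), which the abstract does not specify, and it uses synthetic, noise-free accuracies generated from hypothetical parameters rather than any measured results. The helper names (`power_law`, `fit_lm`, `solve3`) and the starting guess are illustrative only.

```python
import math

def power_law(n, a, b, c):
    # Assumed learning-curve form (not taken from the paper): acc(n) = a - b * n**(-c)
    return a - b * n ** (-c)

def sse(sizes, accs, p):
    # Sum of squared residuals; numeric overflow counts as an infinitely bad fit.
    try:
        return sum((y - power_law(n, *p)) ** 2 for n, y in zip(sizes, accs))
    except OverflowError:
        return math.inf

def solve3(A, g):
    # Solve a 3x3 linear system by Gaussian elimination with partial pivoting.
    M = [row[:] + [gi] for row, gi in zip(A, g)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, 3):
            f = M[r][col] / M[col][col]
            for k in range(col, 4):
                M[r][k] -= f * M[col][k]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        x[r] = (M[r][3] - sum(M[r][k] * x[k] for k in range(r + 1, 3))) / M[r][r]
    return x

def fit_lm(sizes, accs, p0, iters=200):
    # Basic Levenberg-Marquardt loop with Marquardt's diagonal scaling.
    p, lam = list(p0), 1e-3
    for _ in range(iters):
        a, b, c = p
        # Jacobian rows: partial derivatives of the model w.r.t. (a, b, c).
        J = [[1.0, -n ** (-c), b * n ** (-c) * math.log(n)] for n in sizes]
        r = [y - power_law(n, a, b, c) for n, y in zip(sizes, accs)]
        JTJ = [[sum(row[i] * row[j] for row in J) for j in range(3)] for i in range(3)]
        JTr = [sum(row[i] * ri for row, ri in zip(J, r)) for i in range(3)]
        # Damped normal equations: (J^T J + lam * diag(J^T J)) step = J^T r.
        A = [[JTJ[i][j] + (lam * (JTJ[i][i] + 1e-12) if i == j else 0.0)
              for j in range(3)] for i in range(3)]
        step = solve3(A, JTr)
        if max(abs(s) for s in step) < 1e-12:
            break
        trial = [pi + si for pi, si in zip(p, step)]
        if sse(sizes, accs, trial) < sse(sizes, accs, p):
            p, lam = trial, max(lam / 10.0, 1e-12)  # accept step, trust the model more
        else:
            lam *= 10.0                              # reject step, damp harder
    return p

# Synthetic data from hypothetical "true" parameters (illustration only).
TRUE = (0.98, 1.5, 0.6)
train_sizes = [100, 200, 500, 1000, 2000, 5000]
train_accs = [power_law(n, *TRUE) for n in train_sizes]

p0 = (1.0, 1.0, 0.5)  # rough starting guess
initial_sse = sse(train_sizes, train_accs, p0)
fitted = fit_lm(train_sizes, train_accs, p0)
final_sse = sse(train_sizes, train_accs, fitted)

# Extrapolate to larger dataset sizes and compare with held-out "observations".
held_out_sizes = [10000, 20000]
held_out_accs = [power_law(n, *TRUE) for n in held_out_sizes]
predicted = [power_law(n, *fitted) for n in held_out_sizes]

# Pearson-style chi-squared statistic between observed and predicted accuracy.
chi2_stat = sum((o - e) ** 2 / e for o, e in zip(held_out_accs, predicted))

print("fitted (a, b, c):", fitted)
print("predicted accuracy at held-out sizes:", predicted)
print("chi-squared statistic:", chi2_stat)
```

In practice, `scipy.optimize.curve_fit` with `method="lm"` provides a well-tested Levenberg-Marquardt fit, and `scipy.stats.chi2.sf` converts the statistic into a p-value for a chosen number of degrees of freedom. The hand-written loop above is only meant to make the damping and accept/reject logic visible.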