Estimating the Required Training Dataset Size for Transmitter Classification Using Deep Learning

2019 IEEE International Symposium on Dynamic Spectrum Access Networks (DySPAN). Pub Date: 2019-11-01. DOI: 10.1109/DySPAN.2019.8935823

T. Oyedare, J. Park

Citations: 21

Abstract

Despite the recent surge in the application of deep learning to wireless communication problems, very little is known about the training dataset size required to solve difficult problems, including transmitter classification, with acceptable accuracy. Many researchers use rules of thumb to estimate how much training data is needed for a given classification or identification task. For artificial neural network (ANN) research, these rules of thumb may suffice; for convolutional neural networks (CNNs), a class of deep neural networks, they may not hold, and researchers are often left to figure out the training dataset size needed for accurate classification. In this paper, we investigate the correlation between training dataset size and classification accuracy for transmitter classification applications by examining whether the rules of thumb used in ANN research apply to CNN-based transmitter classification tasks. We predict the classification performance of a CNN-based architecture for a given dataset size using a power law model and the Levenberg-Marquardt algorithm. We use the chi-squared goodness-of-fit test to validate our predicted model. Our results show that we can predict classification accuracy for larger training dataset sizes under different experimental scenarios with at least 97.5% accuracy. We also compare our scheme with similar prior work in wireless transmitter classification. Finally, we propose a rule of thumb for the required training dataset size in transmitter classification using CNNs.
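
The fitting procedure named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the three-parameter form acc(n) = a - b * n^(-c) is one common way to write a power-law learning curve and is assumed here, the data points are synthetic, and the per-point noise level `sigma` used for the chi-squared check is also an assumption. SciPy's `curve_fit` with `method="lm"` performs the Levenberg-Marquardt fit.

```python
import numpy as np
from scipy import stats
from scipy.optimize import curve_fit


def power_law(n, a, b, c):
    """Assumed learning-curve form: accuracy rises toward a as n grows."""
    return a - b * np.power(n, -c)


# Synthetic learning-curve points (illustrative only, NOT data from the paper).
sizes = np.array([500, 1000, 2000, 4000, 8000, 16000, 32000], dtype=float)
true_params = (0.98, 2.0, 0.5)   # hypothetical ground truth for the demo
sigma = 0.001                    # assumed measurement noise per point
rng = np.random.default_rng(0)
accuracy = power_law(sizes, *true_params) + rng.normal(0.0, sigma, sizes.size)

# method="lm" selects the Levenberg-Marquardt solver (no parameter bounds).
params, _ = curve_fit(power_law, sizes, accuracy, p0=(1.0, 1.0, 0.5), method="lm")

# Extrapolate to a training set larger than any used in the fit.
predicted = power_law(64000.0, *params)

# Chi-squared goodness of fit: noise-scaled squared residuals,
# with degrees of freedom = number of points - number of fitted parameters.
residuals = accuracy - power_law(sizes, *params)
chi_sq = np.sum((residuals / sigma) ** 2)
dof = sizes.size - len(params)
p_value = stats.chi2.sf(chi_sq, dof)

print(f"fitted a, b, c: {params}")
print(f"predicted accuracy at n=64000: {predicted:.4f}")
print(f"chi-squared = {chi_sq:.3f}, dof = {dof}, p = {p_value:.3f}")
```

A small p-value would suggest that the assumed power-law form does not describe the measured accuracies well; a fit that passes can then be used to read off the dataset size at which a target accuracy is predicted.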
