How Does the Data set Affect CNN-based Image Classification Performance?
Authors: Chao Luo, Xiaojie Li, Lutao Wang, Jia He, Denggao Li, Jiliu Zhou
Published in: 2018 5th International Conference on Systems and Informatics (ICSAI), 2018-11-01
DOI: 10.1109/ICSAI.2018.8599448 (https://doi.org/10.1109/ICSAI.2018.8599448)
Convolutional neural networks (ConvNets or CNNs) have proven very effective in areas such as image recognition and classification. In image classification in particular, CNN-based methods have achieved excellent performance. Because performance is the main indicator for evaluating a CNN-based classification method, it is important to study which factors affect it. Image classification performance is known to depend on both the network structure and the size of the data set, and data set size in particular has a significant impact. For most practitioners, however, large data sets are difficult to obtain. We therefore ask: how does the size of the data set affect performance? To answer this question, we perform 35 groups of experiments with 5 runs in each group (175 experiments in total). For each k-class classification task, we run 5 groups with increasing training set sizes and observe the changes in accuracy to analyze the effect of data set size on performance. For the same CNN-based network, the average accuracy results show that the larger the training set, the higher the test accuracy. However, good results can still be obtained when the training data are insufficient. Furthermore, within each group of experiments, the more categories there are to classify, the more pronounced the performance change. These results can guide image classification experiments and are also relevant to other experiments based on deep learning.
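The experiment counts in the abstract can be checked with a small sketch. If each k-class task contributes exactly 5 groups, then 35 groups imply 7 distinct values of k, and 5 runs per group give 175 runs. The class counts and training set sizes below are hypothetical placeholders, since the abstract does not list them; only the 5 groups per task and 5 runs per group come from the text.

```python
from itertools import product
from statistics import mean

# Hypothetical values: the abstract does not state the class counts or set sizes.
CLASS_COUNTS = [2, 3, 4, 5, 6, 7, 8]       # 7 assumed values of k (35 groups / 5 per k)
TRAIN_SIZES = [100, 200, 400, 800, 1600]   # 5 assumed training set sizes per k
RUNS_PER_GROUP = 5                         # stated in the abstract


def build_groups(class_counts, train_sizes):
    """Return one (k, train_size) pair per experiment group."""
    return list(product(class_counts, train_sizes))


def average_accuracy(run_accuracies):
    """Average the test accuracies of the repeated runs within one group."""
    return mean(run_accuracies)


groups = build_groups(CLASS_COUNTS, TRAIN_SIZES)
total_runs = len(groups) * RUNS_PER_GROUP
print(len(groups), total_runs)  # prints "35 175"
```

Averaging over the repeated runs in each group, as `average_accuracy` does, is one plausible reading of the "average accuracy" the abstract reports; the paper itself may aggregate differently.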