Transferred Parallel Convolutional Neural Network for Large Imbalanced Plankton Database Classification

Chao Wang, Xueer Zheng, Chunfeng Guo, Zhibin Yu, Jia Yu, Haiyong Zheng, Bing Zheng
Published in: 2018 OCEANS - MTS/IEEE Kobe Techno-Oceans (OTO)
Publication date: 2018-05-28
DOI: 10.1109/OCEANSKOBE.2018.8558836 (https://doi.org/10.1109/OCEANSKOBE.2018.8558836)
Citation count: 8

Abstract

Plankton are critically important to our ecosystem, accounting for more than half of the primary productivity on Earth and nearly half of the total carbon fixed in the global carbon cycle. Loss of plankton populations could result in ecological upheaval as well as negative societal impacts. By contrast, a phytoplankton bloom can produce red tides that cause huge economic losses. Information on species populations and distributions is therefore valuable. Recently, convolutional neural networks (CNNs) have achieved state-of-the-art results on large-scale image classification. We apply several popular CNN models to the WHOI large-scale plankton database. They achieve high accuracy on this dataset, but because the class distribution of WHOI is imbalanced, we have to address a data imbalance problem. To evaluate the classifier impartially, we introduce the F1 score as an evaluation criterion. Although the CNN models achieve high global accuracy on the database, they achieve low F1 scores: 0.17 and 0.29 for the CIFAR10 CNN model and the VGG16 model, respectively. In this paper, we introduce a transferred parallel model approach to overcome this problem. We pre-train a CNN model on the small classes, those with fewer than 5,000 images. The pre-trained model is then treated as a feature extractor to enhance the features of the small classes: we fix all of its weights and combine it with a parallel network to train on the whole training database. With this transferred-feature-based approach, we achieve higher F1 scores of 0.3752 and 0.5444 with our models based on the CIFAR10 CNN model and the VGG16 model, respectively.
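
The gap between global accuracy and F1 score on imbalanced data can be illustrated with a minimal Python sketch. It is not the authors' evaluation code. It assumes a macro-averaged F1 (the abstract does not say which averaging was used), and the class names, sample counts, and helper names (`accuracy`, `macro_f1`, `small_classes`) are hypothetical. The `small_classes` helper only mirrors the 5,000-image cutoff mentioned in the abstract.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed averaging, not stated in the abstract)."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def small_classes(labels, threshold=5000):
    """Classes with fewer than `threshold` images (cutoff taken from the abstract)."""
    counts = Counter(labels)
    return sorted(c for c, n in counts.items() if n < threshold)


# Toy imbalanced data with made-up labels: 95 majority images, 5 minority images.
y_true = ["detritus"] * 95 + ["ciliate"] * 5
# A degenerate classifier that always predicts the majority class.
y_pred = ["detritus"] * 100

acc = accuracy(y_true, y_pred)   # high, because the majority class dominates
f1 = macro_f1(y_true, y_pred)    # much lower, because the minority class scores 0
rare = small_classes(y_true, threshold=10)
```

In this toy case the always-majority classifier reaches 95% accuracy while its macro F1 is below 0.5, which is the kind of discrepancy the abstract reports for the baseline CNNs.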