Spatial Pyramid Pooling with Atrous Convolutional for MobileNet

Nur Ayuni Mohamed, M. A. Zulkifley, Siti Raihanah Abdani
Published in: 2020 IEEE Student Conference on Research and Development (SCOReD), 2020-09-27
DOI: https://doi.org/10.1109/SCOReD50371.2020.9250928
Citations: 7

Abstract

Disease screening from fundus images is one of the most active research topics in biomedical engineering. Many diseases can be screened from human retinal information, including glaucoma, myopia, macular degeneration, diabetic retinopathy, and cataracts, so an automated system that screens for all of them would benefit health practitioners. With a traditional machine learning approach, the features for each disease must be designed by hand. This makes it hard to handle several diseases in a single system, especially when a newly added disease does not fit the handcrafted features well. A deep learning approach that uses learned features is therefore a better alternative, because the model can be updated easily when a new disease needs to be added. This paper proposes a modified MobileNet architecture in which the top layers are replaced with a spatial pyramid pooling module. Three parallel max-pooling branches, each with a different kernel size, are implemented to improve the algorithm's robustness to multi-scale input. Atrous convolution is also employed by adding a dilation rate to each of the pointwise convolution operators. The results show that a dilation rate of 4 produces the best mean accuracy, 0.7433, in a 5-fold cross-validation test. The algorithm retains its lightweight nature, with a total of around 3 million parameters. The model can be trained better if the number of samples per class is roughly equal, which reduces training bias.
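
The parallel max-pooling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a single-channel feature map, stride-1 pooling with windows clipped at the border, and branches stacked as channels. The kernel sizes (3, 5, 7) are hypothetical placeholders, since the actual sizes are not recoverable from the text, and the function names `max_pool_same` and `spatial_pyramid_pool` are illustrative only.

```python
def max_pool_same(feature_map, k):
    """Stride-1 max pooling with a k x k window.

    Windows are clipped at the border, so the output keeps the
    height and width of the input (an assumption of this sketch).
    """
    h, w = len(feature_map), len(feature_map[0])
    r = k // 2  # half-width of the window
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            window = [
                feature_map[y][x]
                for y in range(max(0, i - r), min(h, i + r + 1))
                for x in range(max(0, j - r), min(w, j + r + 1))
            ]
            row.append(max(window))
        out.append(row)
    return out


def spatial_pyramid_pool(feature_map, kernel_sizes=(3, 5, 7)):
    """Run one max-pooling branch per kernel size and stack the results.

    kernel_sizes are hypothetical defaults, not values from the paper.
    """
    return [max_pool_same(feature_map, k) for k in kernel_sizes]


if __name__ == "__main__":
    fm = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]
    for k, branch in zip((3, 5, 7), spatial_pyramid_pool(fm)):
        print(k, branch)
```

Each branch summarizes the same feature map over a different neighborhood size, which is what gives the pooled output its tolerance to features appearing at different scales.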