Spatial Pyramid Pooling with Atrous Convolutional for MobileNet

2020 IEEE Student Conference on Research and Development (SCOReD). Pub Date: 2020-09-27. DOI: 10.1109/SCOReD50371.2020.9250928

Nur Ayuni Mohamed, M. A. Zulkifley, Siti Raihanah Abdani

Citations: 7

Abstract

Disease screening through fundus images is one of the most active research topics in biomedical engineering. Various diseases can be screened through human retinal information, including glaucoma, myopia, macular degeneration, diabetic retinopathy, and cataracts. Hence, an automated system that screens for all of these diseases would benefit health practitioners. Under the traditional machine learning approach, the features for each disease have to be designed by hand. This makes it hard to handle multiple diseases in a single system, especially when a new disease that needs to be added does not fit well with the existing handcrafted features. A deep learning approach that uses learned features is therefore the better alternative, as the model can be updated easily when a new disease needs to be added. This paper proposes a modified MobileNet architecture in which the top layers are replaced with a spatial pyramid pooling module. Three parallel flows of max-pooling operations, each with its own kernel size, are implemented to improve the algorithm's robustness to multi-scale input. Atrous convolution is also employed by adding a dilation rate to each of the pointwise convolution operators. The results show that a dilation rate of 4 produces the best mean accuracy of 0.7433 in the 5-fold cross-validation test. The algorithm retains its lightweight nature, with a total of around 3 million parameters. The model can be trained better if the number of samples is roughly the same across classes, which reduces training bias.
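The idea of several parallel max-pooling flows feeding one feature vector can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the kernel sizes (1, 2, 4), the choice of stride equal to kernel size, the single-channel input, and the names `max_pool2d` and `pyramid_pool` are all assumptions made for illustration, since this record does not state those details.

```python
from typing import List, Sequence

FeatureMap = List[List[float]]  # single-channel 2D feature map (assumption)


def max_pool2d(fmap: FeatureMap, kernel: int) -> FeatureMap:
    """Non-overlapping max pooling: stride equals kernel size, no padding."""
    rows, cols = len(fmap), len(fmap[0])
    pooled = []
    for r in range(0, rows - kernel + 1, kernel):
        pooled_row = []
        for c in range(0, cols - kernel + 1, kernel):
            # Collect the kernel x kernel window and keep its maximum.
            window = [fmap[r + i][c + j]
                      for i in range(kernel)
                      for j in range(kernel)]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


def pyramid_pool(fmap: FeatureMap, kernels: Sequence[int] = (1, 2, 4)) -> List[float]:
    """Run one max-pooling flow per kernel size and concatenate the flattened outputs.

    The default kernel sizes are hypothetical placeholders.
    """
    features: List[float] = []
    for k in kernels:
        for row in max_pool2d(fmap, k):
            features.extend(row)
    return features


# Toy 4x4 feature map holding the values 1..16 in row-major order.
toy = [[1, 2, 3, 4],
       [5, 6, 7, 8],
       [9, 10, 11, 12],
       [13, 14, 15, 16]]
vector = pyramid_pool(toy)
print(len(vector))
```

Each flow summarizes the same feature map at a different spatial scale, and the concatenated vector carries the fine and coarse summaries side by side. A real model would apply this per channel on tensors in a deep learning framework rather than on nested lists.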

Source journal

2020 IEEE Student Conference on Research and Development (SCOReD)
