Nur Ayuni Mohamed, M. A. Zulkifley, Siti Raihanah Abdani
2020 IEEE Student Conference on Research and Development (SCOReD), published 2020-09-27
DOI: 10.1109/SCOReD50371.2020.9250928 (https://doi.org/10.1109/SCOReD50371.2020.9250928)
Spatial Pyramid Pooling with Atrous Convolutional for MobileNet
Disease screening from fundus images is an active research topic in biomedical engineering. Several diseases can be screened using retinal information, including glaucoma, myopia, macular degeneration, diabetic retinopathy, and cataracts. Hence, an automated system that screens for all of these diseases would benefit health practitioners. With a traditional machine learning approach, the features for each disease have to be designed by hand. This makes it hard to handle multiple diseases in a single system, especially when a new disease to be added does not fit well with the existing handcrafted features. A deep learning approach that uses learned features is therefore the better alternative, because the model can be updated easily when a new disease is added. This paper proposes a modified MobileNet architecture in which the top layers are replaced with a spatial pyramid pooling module. Three parallel max-pooling flows, each with its own kernel size, are implemented to improve the robustness of the algorithm to multi-scale input. Atrous convolution is also employed by adding a dilation rate to each of the pointwise convolution operators. The results show that a dilation rate of 4 produces the best mean accuracy of 0.7433 in the 5-fold cross-validation test. The algorithm retains its lightweight nature, with a total of around 3 million parameters. The model can be trained better if the number of samples is roughly the same across classes, which reduces training bias.
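The parallel max-pooling idea in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the kernel sizes (2, 3, 6), the stride equal to the kernel size, the 6×6 single-channel feature map, and the function names `max_pool2d` and `pyramid_pool` are all assumptions chosen for illustration. The MobileNet backbone and the atrous pointwise convolutions are omitted.

```python
from typing import List, Sequence

Grid = List[List[float]]


def max_pool2d(feature_map: Grid, kernel: int, stride: int) -> Grid:
    """Max-pool a 2-D feature map with a square window and no padding."""
    height, width = len(feature_map), len(feature_map[0])
    pooled = []
    for top in range(0, height - kernel + 1, stride):
        row = []
        for left in range(0, width - kernel + 1, stride):
            window = [
                feature_map[top + dy][left + dx]
                for dy in range(kernel)
                for dx in range(kernel)
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled


def pyramid_pool(feature_map: Grid, kernels: Sequence[int]) -> List[float]:
    """Run one max-pooling branch per kernel size and concatenate the results."""
    features: List[float] = []
    for k in kernels:
        # Assumption: each branch uses a stride equal to its kernel size.
        branch = max_pool2d(feature_map, kernel=k, stride=k)
        features.extend(value for row in branch for value in row)
    return features


# Hypothetical 6x6 feature map holding the values 0..35 in row-major order.
feature_map = [[float(r * 6 + c) for c in range(6)] for r in range(6)]

# Hypothetical kernel sizes; the branches yield 9, 4, and 1 values respectively.
pooled = pyramid_pool(feature_map, kernels=(2, 3, 6))
print(len(pooled))
```

Each branch summarizes the same feature map at a different granularity, and the concatenated vector is what a classifier head would consume. In a real network the same operation would run per channel on the backbone output.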