M. Santos-Romero, Javier Arellano-Verdejo, H. Lazcano-Hernandez, Pedro Damián-Reyes
Title: Automatic classification of images with beach linear perspective using convolutional neural networks
Published in: 2022 IEEE Mexican International Conference on Computer Science (ENC)
Publication date: 2022-08-24
DOI: 10.1109/ENC56672.2022.9882952 (https://doi.org/10.1109/ENC56672.2022.9882952)
Citations: 1
Abstract
Since 2018, Sargassum stranding has increased on the beaches of the Caribbean Sea. Satellite-based Sargassum monitoring has made it possible to study its dynamics at a regional scale (mesoscale). However, these methods are not always used for monitoring at the beach level (human scale). Information is required at this level so that the authorities in charge of Sargassum management can design suitable strategies to address stranding. Artificial Intelligence techniques such as Machine Learning (ML) have recently been incorporated into beach-level data analysis, which has required the construction of new datasets. Because of the length of the beaches, various techniques are used for image collection, one of which is crowdsourcing. So that only images with the visual and framing characteristics useful for calculating Sargassum coverage on a beach are used, imagery submitted by participants must be classified before analysis. The present study proposes a method for automatically classifying images with in-depth linear perspective, which allows adequate datasets to be built for studies of accumulated Sargassum that require highlighting the presence of this macroalga along beaches. The detection of Sargassum in the images is outside the scope of this study. For this purpose, the Transfer Learning technique was applied to three neural networks from the literature. ResNet50, MobileNetV2, and VGG16 were re-trained with a balanced dataset of 5,000 images with and without linear beach perspective. After several experiments, accuracies of 80%, 90%, and 91%, respectively, were achieved. To complement the information provided by the Accuracy, Precision, and F1-score metrics, a resampling analysis of the F1-score was performed with the Bootstrapping technique, concluding that VGG16 and MobileNetV2 obtained similar results, with a 95% confidence interval between 0.88 and 0.92.
Finally, considering the spatial and temporal complexity of both architectures, MobileNetV2 is proposed as the most appropriate for classifying images with and without in-depth linear perspective. Beyond requiring new datasets, classifying this type of image takes the application of ML to another level: in addition to geometry or color, the model has to discriminate the texture and spatial distribution of the elements in the image in order to classify it successfully.
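
The bootstrap resampling of the F1 score mentioned in the abstract can be sketched as follows. This is a minimal illustration, assuming a percentile bootstrap over paired labels and predictions. The abstract does not state which bootstrap variant, number of resamples, or software was used, so the function names, the default of 1,000 resamples, and the synthetic labels below are assumptions for illustration and not the authors' procedure.

```python
import random


def f1_score(y_true, y_pred, positive=1):
    """F1 score for the positive class; returns 0.0 when it is undefined."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def bootstrap_f1_ci(y_true, y_pred, n_resamples=1000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the F1 score.

    Hypothetical helper: resamples (label, prediction) pairs with
    replacement and reads the interval off the sorted resampled scores.
    """
    rng = random.Random(seed)
    n = len(y_true)
    scores = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        scores.append(
            f1_score([y_true[i] for i in idx], [y_pred[i] for i in idx])
        )
    scores.sort()
    alpha = (1.0 - level) / 2.0
    lower = scores[int(alpha * (n_resamples - 1))]
    upper = scores[int((1.0 - alpha) * (n_resamples - 1))]
    return lower, upper


# Synthetic labels, made up purely for illustration (not the study's data).
rng = random.Random(42)
y_true = [rng.randint(0, 1) for _ in range(500)]
# A hypothetical classifier that agrees with the label about 90% of the time.
y_pred = [t if rng.random() < 0.9 else 1 - t for t in y_true]

point = f1_score(y_true, y_pred)
low, high = bootstrap_f1_ci(y_true, y_pred)
print(f"F1 = {point:.3f}, 95% CI = [{low:.3f}, {high:.3f}]")
```

Comparing two models this way amounts to computing one interval per model on the same test set. Intervals that largely overlap, as reported for VGG16 and MobileNetV2, suggest that the F1 score alone does not separate them.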