Automatic classification of images with beach linear perspective using convolutional neural networks

M. Santos-Romero, Javier Arellano-Verdejo, H. Lazcano-Hernandez, Pedro Damián-Reyes
Published in: 2022 IEEE Mexican International Conference on Computer Science (ENC), 2022-08-24
DOI: 10.1109/ENC56672.2022.9882952 (https://doi.org/10.1109/ENC56672.2022.9882952)
Citations: 1

Abstract

Automatic classification of images with beach linear perspective using convolutional neural networks
Since 2018, Sargassum stranding has increased on the beaches of the Caribbean Sea. Satellite-based Sargassum monitoring has made it possible to study its dynamics at a regional scale (mesoscale). However, these methods are not always used for monitoring at the beach level (human scale). Information is required at this level so that the authorities in charge of Sargassum management can design suitable strategies to address stranding. Artificial Intelligence techniques such as Machine Learning (ML) have recently been incorporated into beach-level data analysis, which has required the construction of new datasets. Because of the length of the beaches, various techniques are used for image collection, one of which is crowdsourcing. So that only images with the visual and framing characteristics useful for calculating Sargassum coverage on a beach are used, the imagery submitted by participants must be classified before analysis. The present study proposes a method for automatically classifying images with in-depth linear perspective, which makes it possible to build adequate datasets for studies of accumulated Sargassum that need to highlight the presence of this macroalga along the beaches. Detecting Sargassum in the images is outside the scope of this study. For this purpose, the Transfer Learning technique was applied to three neural networks from the literature: ResNet50, MobileNetV2, and VGG16 were re-trained on a balanced dataset of 5,000 images with and without linear beach perspective. After several experiments, they achieved accuracies of 80%, 90%, and 91%, respectively. To complement the information provided by the Accuracy, Precision, and F1-score metrics, a resampling analysis of the F1-score was performed with the Bootstrapping technique, concluding that VGG16 and MobileNetV2 obtained similar results, with a 95% confidence interval between 0.88 and 0.92. Finally, considering the spatial and temporal complexity of both architectures, MobileNetV2 is proposed as the most appropriate for classifying images with and without in-depth linear perspective. Classifying this type of image, in addition to requiring new datasets, takes the application of ML to another level: beyond geometry or color, the model has to discriminate the texture and spatial distribution of the elements in the image in order to classify it successfully.
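The bootstrap analysis of the F1-score mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the resampling was done, so everything here is an assumption for illustration: the percentile method, the default of 1,000 resamples, the fixed seed, and the function names `f1_score` and `bootstrap_f1_ci` are hypothetical and not taken from the paper.

```python
import random


def f1_score(y_true, y_pred):
    """F1 for the positive class (label 1) given binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    if tp == 0:
        # No true positives: precision or recall is zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def bootstrap_f1_ci(y_true, y_pred, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the F1-score.

    Resamples (label, prediction) pairs with replacement, recomputes F1
    on each resample, and returns the alpha/2 and 1 - alpha/2 percentiles.
    """
    rng = random.Random(seed)  # assumed fixed seed, for reproducibility
    n = len(y_true)
    scores = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        scores.append(f1_score([y_true[i] for i in idx],
                               [y_pred[i] for i in idx]))
    scores.sort()
    lower = scores[int((alpha / 2) * n_resamples)]
    upper = scores[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lower, upper
```

Given a model's predictions on a held-out test set, `bootstrap_f1_ci(y_true, y_pred)` returns a `(lower, upper)` pair. Two models whose intervals largely overlap, as the abstract reports for VGG16 and MobileNetV2, cannot be separated on F1 alone, which is why a secondary criterion such as model complexity is then used to choose between them.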