超参数对deep plabv3 + RGB图像水体分割性能的影响

Q2 Social Sciences
Onteddu Chaitanya Reddy, Illa Dinesh Kumar, Pingali Sathvika, Sajith Variyar, Sowmya, R. Sivanpillai
{"title":"超参数对deep plabv3 + RGB图像水体分割性能的影响","authors":"Onteddu Chaitanya Reddy, Illa Dinesh Kumar, Pingali Sathvika, Sajith Variyar, Sowmya, R. Sivanpillai","doi":"10.5194/isprs-archives-xlviii-m-3-2023-203-2023","DOIUrl":null,"url":null,"abstract":"Abstract. Deep Learning (DL) networks used in image segmentation tasks must be trained with input images and corresponding masks that identify target features in them. DL networks learn by iteratively adjusting the weights of interconnected layers using backpropagation, a process that involves calculating gradients and minimizing a loss function. This allows the network to learn patterns and relationships in the data, enabling it to make predictions or classifications on new, unseen data. Training any DL network requires specifying values of the hyperparameters such as input image size, batch size, and number of epochs among others. Failure to specify optimal values for the parameters will increase the training time or result in incomplete learning. The rationale of this study was to evaluate the effect of input image and batch sizes on the performance of DeepLabV3+ using Sentinel 2 A/B RGB images and labels obtained from Kaggle. We trained DeepLabV3+ network six times with two sets of input images of 128 × 128-pixel, and 256 × 256-pixel dimensions with 4, 8 and 16 batch sizes. The model is trained for 100 epochs to ensure that the loss plot reaches saturation and the model converged to a stable solution. Predicted masks generated by each model were compared to their corresponding test mask images based on accuracy, precision, recall and F1 scores. Results from this study demonstrated that image size of 256 × 256 and batch size 4 achieved highest performance. It can also be inferred that larger input image size improved DeepLabV3+ model performance.\n","PeriodicalId":30634,"journal":{"name":"The International Archives of the Photogrammetry Remote Sensing and Spatial Information Sciences","volume":" ","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2023-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"EFFECT OF HYPERPARAMETERS ON DEEPLABV3+ PERFORMANCE TO SEGMENT WATER BODIES IN RGB IMAGES\",\"authors\":\"Onteddu Chaitanya Reddy, Illa Dinesh Kumar, Pingali Sathvika, Sajith Variyar, Sowmya, R. Sivanpillai\",\"doi\":\"10.5194/isprs-archives-xlviii-m-3-2023-203-2023\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Abstract. Deep Learning (DL) networks used in image segmentation tasks must be trained with input images and corresponding masks that identify target features in them. DL networks learn by iteratively adjusting the weights of interconnected layers using backpropagation, a process that involves calculating gradients and minimizing a loss function. This allows the network to learn patterns and relationships in the data, enabling it to make predictions or classifications on new, unseen data. Training any DL network requires specifying values of the hyperparameters such as input image size, batch size, and number of epochs among others. Failure to specify optimal values for the parameters will increase the training time or result in incomplete learning. The rationale of this study was to evaluate the effect of input image and batch sizes on the performance of DeepLabV3+ using Sentinel 2 A/B RGB images and labels obtained from Kaggle. We trained DeepLabV3+ network six times with two sets of input images of 128 × 128-pixel, and 256 × 256-pixel dimensions with 4, 8 and 16 batch sizes. The model is trained for 100 epochs to ensure that the loss plot reaches saturation and the model converged to a stable solution. Predicted masks generated by each model were compared to their corresponding test mask images based on accuracy, precision, recall and F1 scores. Results from this study demonstrated that image size of 256 × 256 and batch size 4 achieved highest performance. It can also be inferred that larger input image size improved DeepLabV3+ model performance.\\n\",\"PeriodicalId\":30634,\"journal\":{\"name\":\"The International Archives of the Photogrammetry Remote Sensing and Spatial Information Sciences\",\"volume\":\" \",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-09-05\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"The International Archives of the Photogrammetry Remote Sensing and Spatial Information Sciences\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.5194/isprs-archives-xlviii-m-3-2023-203-2023\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"Social Sciences\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"The International Archives of the Photogrammetry Remote Sensing and Spatial Information Sciences","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.5194/isprs-archives-xlviii-m-3-2023-203-2023","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"Social Sciences","Score":null,"Total":0}
引用次数: 0

摘要

摘要用于图像分割任务的深度学习(DL)网络必须使用输入图像和相应的掩码进行训练,以识别其中的目标特征。深度学习网络通过使用反向传播迭代调整互连层的权重来学习,这一过程涉及计算梯度和最小化损失函数。这使得网络能够学习数据中的模式和关系,使其能够对新的、未见过的数据进行预测或分类。训练任何深度学习网络都需要指定超参数的值,如输入图像大小、批处理大小和epoch数量等。如果不能指定参数的最优值,则会增加训练时间或导致学习不完全。本研究的基本原理是使用Sentinel 2 A/B RGB图像和从Kaggle获得的标签来评估输入图像和批大小对DeepLabV3+性能的影响。我们对DeepLabV3+网络进行了6次训练,输入图像尺寸分别为128 × 128像素和256 × 256像素,batch size分别为4、8和16。对模型进行100次epoch的训练,以保证损失图达到饱和,模型收敛到稳定解。根据准确率、精密度、召回率和F1分数,将每个模型生成的预测掩模与相应的测试掩模图像进行比较。研究结果表明,图像大小为256 × 256和批处理大小为4时,获得了最高的性能。也可以推断,更大的输入图像尺寸提高了DeepLabV3+模型的性能。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
EFFECT OF HYPERPARAMETERS ON DEEPLABV3+ PERFORMANCE TO SEGMENT WATER BODIES IN RGB IMAGES
Abstract. Deep Learning (DL) networks used in image segmentation tasks must be trained with input images and corresponding masks that identify target features in them. DL networks learn by iteratively adjusting the weights of interconnected layers using backpropagation, a process that involves calculating gradients and minimizing a loss function. This allows the network to learn patterns and relationships in the data, enabling it to make predictions or classifications on new, unseen data. Training any DL network requires specifying values of the hyperparameters such as input image size, batch size, and number of epochs among others. Failure to specify optimal values for the parameters will increase the training time or result in incomplete learning. The rationale of this study was to evaluate the effect of input image and batch sizes on the performance of DeepLabV3+ using Sentinel 2 A/B RGB images and labels obtained from Kaggle. We trained DeepLabV3+ network six times with two sets of input images of 128 × 128-pixel, and 256 × 256-pixel dimensions with 4, 8 and 16 batch sizes. The model is trained for 100 epochs to ensure that the loss plot reaches saturation and the model converged to a stable solution. Predicted masks generated by each model were compared to their corresponding test mask images based on accuracy, precision, recall and F1 scores. Results from this study demonstrated that image size of 256 × 256 and batch size 4 achieved highest performance. It can also be inferred that larger input image size improved DeepLabV3+ model performance.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
CiteScore
1.70
自引率
0.00%
发文量
949
审稿时长
16 weeks
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信