Onteddu Chaitanya Reddy, Illa Dinesh Kumar, Pingali Sathvika, Sajith Variyar, Sowmya, R. Sivanpillai
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences. Published 2023-09-05. DOI: https://doi.org/10.5194/isprs-archives-xlviii-m-3-2023-203-2023
EFFECT OF HYPERPARAMETERS ON DEEPLABV3+ PERFORMANCE TO SEGMENT WATER BODIES IN RGB IMAGES
Abstract. Deep Learning (DL) networks used in image segmentation tasks must be trained with input images and corresponding masks that identify the target features in them. DL networks learn by iteratively adjusting the weights of interconnected layers using backpropagation, a process that involves calculating gradients and minimizing a loss function. This allows the network to learn patterns and relationships in the data, enabling it to make predictions or classifications on new, unseen data. Training any DL network requires specifying values for hyperparameters such as input image size, batch size, and number of epochs, among others. Poorly chosen hyperparameter values can increase training time or result in incomplete learning. The aim of this study was to evaluate the effect of input image size and batch size on the performance of DeepLabV3+ using Sentinel-2A/B RGB images and labels obtained from Kaggle. We trained the DeepLabV3+ network six times, using input images of 128 × 128 and 256 × 256 pixels, each with batch sizes of 4, 8, and 16. Each model was trained for 100 epochs to ensure that the loss curve reached saturation and the model converged to a stable solution. Predicted masks generated by each model were compared to their corresponding test mask images based on accuracy, precision, recall, and F1 scores. The combination of a 256 × 256 image size and a batch size of 4 achieved the highest performance. The results also suggest that the larger input image size improved DeepLabV3+ performance.
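The experimental design described in the abstract (two image sizes crossed with three batch sizes, evaluated with four pixel-wise scores) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `training_configs` and `mask_metrics` are hypothetical, masks are assumed to be binary with 1 meaning water, and the abstract does not say whether scores were pooled over all pixels or averaged per image.

```python
from itertools import product

# Hyperparameter values stated in the abstract.
IMAGE_SIZES = (128, 256)   # square input tiles, side length in pixels
BATCH_SIZES = (4, 8, 16)
EPOCHS = 100


def training_configs():
    """Enumerate the six (image size, batch size) training runs."""
    return [
        {"image_size": size, "batch_size": batch, "epochs": EPOCHS}
        for size, batch in product(IMAGE_SIZES, BATCH_SIZES)
    ]


def mask_metrics(predicted, reference):
    """Pixel-wise accuracy, precision, recall and F1 for two binary masks.

    Both masks are 2-D lists of 0/1 values with the same shape.
    Assumption: 1 marks water (the positive class), 0 marks background.
    """
    tp = fp = fn = tn = 0
    for pred_row, ref_row in zip(predicted, reference):
        for p, r in zip(pred_row, ref_row):
            if p and r:
                tp += 1      # water predicted, water in reference
            elif p:
                fp += 1      # water predicted, background in reference
            elif r:
                fn += 1      # background predicted, water in reference
            else:
                tn += 1      # background in both
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

Each dictionary returned by `training_configs()` would parameterize one training run, and `mask_metrics` would be applied to each predicted mask and its test mask. Zero denominators are mapped to 0.0 here as a convention of this sketch; the abstract does not state how such cases were handled.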