{"title":"Remote Sensing Image Classification Via Vision Transformer and Transfer Learning","authors":"M. Khan, Muhammad Rajwana","doi":"10.15849/ijasca.220328.14","DOIUrl":null,"url":null,"abstract":"Abstract Aerial scene classification, which aims to automatically tag an aerial image with a specific semantic category, is a fundamental problem for understanding high-resolution remote sensing imagery. The classification of remote sensing image scenes can provide significant value, from forest fire monitoring to land use and land cover classification. From the first aerial photographs of the early 20th century to today's satellite imagery, the amount of remote sensing data has increased geometrically with higher resolution. The need to analyze this modern digital data has motivated research to accelerate the classification of remotely sensed images. Fortunately, the computer vision community has made great strides in classifying natural images. Transformers first applied to the field of natural language processing, is a type of deep neural network mainly based on the self-attention mechanism. Thanks to its strong representation capabilities, researchers are looking at ways to apply transformers to computer vision tasks. In a variety of visual benchmarks, transformer-based models perform similar to or better than other types of networks such as convolutional and recurrent networks. Given its high performance and less need for vision-specific inductive bias, the transformer is receiving more and more attention from the computer vision community. In this paper, we provide a systematic review of the Transfer Learning and Transformer techniques for scene classification using AID datasets. Both approaches give an accuracy of 80% and 84%, for the AID dataset. Keywords: remote sensing, vision transformers, transfer learning, classification accuracy","PeriodicalId":38638,"journal":{"name":"International Journal of Advances in Soft Computing and its Applications","volume":" ","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2022-03-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Advances in Soft Computing and its Applications","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.15849/ijasca.220328.14","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"Computer Science","Score":null,"Total":0}
Citations: 0
Abstract
Aerial scene classification, which aims to automatically tag an aerial image with a specific semantic category, is a fundamental problem in understanding high-resolution remote sensing imagery. Classifying remote sensing image scenes provides significant value in applications ranging from forest fire monitoring to land use and land cover mapping. From the first aerial photographs of the early 20th century to today's satellite imagery, the volume of remote sensing data has grown geometrically, and its resolution has steadily increased. The need to analyze this modern digital data has motivated research into accelerating the classification of remotely sensed images. Fortunately, the computer vision community has made great strides in classifying natural images. The Transformer, first applied in natural language processing, is a type of deep neural network based mainly on the self-attention mechanism. Thanks to its strong representation capabilities, researchers are exploring ways to apply Transformers to computer vision tasks. On a variety of visual benchmarks, Transformer-based models perform similarly to or better than other network types, such as convolutional and recurrent networks. Given this high performance and the reduced need for vision-specific inductive bias, the Transformer is receiving growing attention from the computer vision community. In this paper, we provide a systematic review of transfer learning and Transformer techniques for scene classification on the AID dataset. The two approaches achieve accuracies of 80% and 84%, respectively, on the AID dataset.

Keywords: remote sensing, vision transformers, transfer learning, classification accuracy
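The abstract attributes the Transformer's representational strength to the self-attention mechanism. As a minimal illustration (not taken from the paper; all names and shapes here are assumptions for exposition), the sketch below implements single-head scaled dot-product self-attention in PyTorch over a sequence of patch tokens, which is the core operation a vision Transformer applies to flattened image patches:

```python
import torch
import torch.nn.functional as F

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention (illustrative sketch).

    x: (batch, tokens, dim) -- e.g. flattened image patches in a ViT.
    w_q, w_k, w_v: (dim, dim) learned projection matrices.
    """
    q = x @ w_q  # queries
    k = x @ w_k  # keys
    v = x @ w_v  # values
    scale = q.size(-1) ** 0.5
    # Token-to-token attention weights: each row sums to 1.
    attn = F.softmax(q @ k.transpose(-2, -1) / scale, dim=-1)
    return attn @ v  # weighted sum of value vectors

# Example: a batch of 4 patch tokens with dimension 8.
x = torch.randn(1, 4, 8)
w = [torch.randn(8, 8) for _ in range(3)]
out = self_attention(x, *w)  # shape (1, 4, 8)
```

Because every token attends to every other token, self-attention captures global context in a single layer, without the locality bias built into convolutions.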
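The paper compares transfer learning and a vision Transformer on AID, but the abstract does not specify the backbone or the training recipe. A minimal, hypothetical fine-tuning sketch, assuming torchvision's ImageNet-pretrained ViT-B/16 as a stand-in backbone and AID's 30 aerial scene categories:

```python
import torch.nn as nn
from torchvision.models import vit_b_16, ViT_B_16_Weights

NUM_CLASSES = 30  # AID contains 30 aerial scene categories

# Load a ViT-B/16 pretrained on ImageNet (assumed backbone, not
# necessarily the one used in the paper) and swap in a new head.
model = vit_b_16(weights=ViT_B_16_Weights.IMAGENET1K_V1)
model.heads.head = nn.Linear(model.heads.head.in_features, NUM_CLASSES)

# Optionally freeze the backbone so only the new head is trained,
# a common transfer-learning setup for small aerial datasets.
for name, p in model.named_parameters():
    if not name.startswith("heads"):
        p.requires_grad = False
```

Freezing the backbone trains quickly and resists overfitting on a dataset of AID's size; unfreezing and fine-tuning the full network at a small learning rate typically trades more compute for higher accuracy.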
About the journal:
The aim of this journal is to provide a lively forum for the communication of original research papers and timely review articles on Advances in Soft Computing and Its Applications. IJASCA will publish only articles of the highest quality. Submissions will be evaluated on their originality and significance. IJASCA invites submissions in all areas of Soft Computing and Its Applications. The scope of the journal includes, but is not limited to:
√ Soft Computing Fundamentals and Optimization
√ Soft Computing for the Big Data Era
√ GPU Computing for Machine Learning
√ Soft Computing Modeling for Perception and Spiritual Intelligence
√ Soft Computing and Agent Technology
√ Soft Computing in Computer Graphics
√ Soft Computing and Pattern Recognition
√ Soft Computing in Biomimetic Pattern Recognition
√ Data Mining for Social Network Data
√ Spatial Data Mining & Information Retrieval
√ Intelligent Software Agent Systems and Architectures
√ Advanced Soft Computing and Multi-Objective Evolutionary Computation
√ Perception-Based Intelligent Decision Systems
√ Spiritual-Based Intelligent Systems
√ Soft Computing in Industry Applications
√ Other issues related to the advances of soft computing in various applications