MLRS-CNN-DWTPL: A New Enhanced Multi-Label Remote Sensing Scene Classification Using Deep Neural Networks with Wavelet Pooling Layers
S. El-Khamy, A. Al-Kabbany, Shimaa El-bana
2021 International Telecommunications Conference (ITC-Egypt), published 2021-07-13
DOI: 10.1109/ITC-Egypt52936.2021.9513885 (https://doi.org/10.1109/ITC-Egypt52936.2021.9513885)
Multi-label remote sensing (MLRS) aerial scene classification is a challenging remote sensing task. Conventional techniques in this research area have mainly focused either on the simplified single-label case or on pixel-based approaches, which cannot efficiently handle high-resolution images. Deep learning (DL) and convolutional neural networks (CNNs) have defined the state of the art in many vision problems in recent years. CNNs often adopt pooling layers to enlarge the receptive field, which can also lower computational complexity. However, conventional pooling methods can lose information, degrading subsequent operations such as feature extraction, image retrieval, and scene analysis. Motivated by this drawback, we propose a new CNN model and investigate the impact of discrete wavelet transform pooling (DWTPL) on its performance. Wavelet pooling allows us to utilize spectral information, which is crucial in multi-label remote sensing tasks. We show consistent improvements in precision and F1-score on the widely adopted AID dataset compared to other models from the recent literature.
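
The idea of pooling with a discrete wavelet transform can be illustrated with a minimal NumPy sketch. It is not the authors' DWTPL layer: the abstract does not specify the wavelet, the number of decomposition levels, or which subbands are kept, so the sketch assumes a one-level orthonormal Haar transform on a single 2D feature map and keeps only the approximation subband. The function names `haar_dwt2` and `wavelet_pool` are hypothetical.

```python
import numpy as np


def haar_dwt2(x):
    """One-level 2D orthonormal Haar transform of an (H, W) array.

    H and W must be even. Returns the approximation subband and three
    detail subbands, each of shape (H/2, W/2).
    """
    x = np.asarray(x, dtype=float)
    if x.ndim != 2 or x.shape[0] % 2 or x.shape[1] % 2:
        raise ValueError("expected a 2D array with even height and width")

    # The four samples of every non-overlapping 2x2 block.
    a = x[0::2, 0::2]  # top-left
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right

    approx = (a + b + c + d) / 2.0       # low-pass in both directions
    detail_cols = (a - b + c - d) / 2.0  # difference across columns
    detail_rows = (a + b - c - d) / 2.0  # difference across rows
    detail_diag = (a - b - c + d) / 2.0  # diagonal difference
    return approx, detail_cols, detail_rows, detail_diag


def wavelet_pool(x):
    """Halve each spatial dimension by keeping only the approximation subband.

    Assumption: a real layer might also use the detail subbands or a
    deeper decomposition; this sketch discards them.
    """
    approx, _, _, _ = haar_dwt2(x)
    return approx


if __name__ == "__main__":
    feature_map = np.arange(16, dtype=float).reshape(4, 4)
    pooled = wavelet_pool(feature_map)
    print(pooled.shape)
```

With these assumptions the approximation subband equals 2x2 average pooling scaled by a factor of two, so the three detail subbands hold exactly the information that average pooling discards. Because the transform is orthonormal, the four subbands together preserve the energy of the input.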