{"title":"UMTF-Net: An Unsupervised Multiscale Transformer Fusion Network for Hyperspectral and Multispectral Image Fusion","authors":"Shuaiqi Liu;Shichong Zhang;Siyuan Liu;Bing Li;Yu-Dong Zhang","doi":"10.1109/JSTARS.2024.3461152","DOIUrl":"https://doi.org/10.1109/JSTARS.2024.3461152","url":null,"abstract":"Hyperspectral images (HSIs) are extensively utilized in several fields due to their abundant spectral band, particularly for tasks like ground object classification and environmental monitoring. However, as a result of equipment and imaging condition constraints, HSI frequently demonstrates a restricted spatial resolution. The fusion of a low-resolution HSI and a high-resolution multispectral image (HR-MSI) of the same scene is a crucial method for generating an HR-HSI. At present, due to factors, such as complexity and GPU memory limitation, most of the HSI–MSI fusion algorithms based on deep learning (DL) cannot utilize the transformer module well to capture the long-range dependence information in large-size remote sensing images. At the same time, the lack of a large amount of high-quality training data has become an important problem that affects the performance of fusion algorithms based on DL. In response to the above issues, this article introduces a new unsupervised multiscale transformer fusion (UMTF) network, called UMTF-Net, which enables HSI–MSI fusion without the need for additional training data. UMTF-Net is composed of an HSI fusion network and a U-network (U-Net)-based multiscale feature extraction network. In order to learn the cross-feature spatial similarity and long-range dependency of MSI and HSI, we first extract the multiscale features of MSI using the U-Net-based multiscale feature extraction network. We then input these features into the corresponding scale cross-feature fusion transformer module in the HSI fusion network to conduct feature fusion. 
Then, we input the fused features into the spatial spectral fuse attention module for spatial spectral feature enhancement, and finally generate HR-HSI. Comparing UMTF-Net to other advanced methods, the fusion results from three datasets and multiple ablation experiments indicate that our method performs excellently in different evaluations.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":null,"pages":null},"PeriodicalIF":4.7,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10680579","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142430751","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Two-Step Motion Compensation Method for Polar Format Images of Terahertz SAR Based on Echo Data","authors":"Shaowen Luo;Qiuyan Wang;Yinwei Li;Xiaolong Chen;Yiming Zhu","doi":"10.1109/JSTARS.2024.3461332","DOIUrl":"10.1109/JSTARS.2024.3461332","url":null,"abstract":"Terahertz synthetic aperture radar (THz SAR) has great potential in the field of remote sensing due to its high resolution and high frame rate. However, THz SAR is very sensitive to motion errors, making even small 2-D error caused by motion error and polar format algorithm (PFA) seriously affect the image quality. Existing microwave SAR autofocusing methods only estimate the error of a single dimension, which cannot meet the accuracy requirements of THz SAR for 2-D error compensation. In this article, a two-step motion compensation method for polar format images of THz SAR based on echo data is proposed. First, by analyzing the conversion model of polar coordinate format, the node where the 2-D error coupling occurs is determined. On this basis, a coarse compensation based on low-frequency fitting is proposed in front of this node to reduce the influence of PFA on the error coupling of 2-D signals. The method not only preserves the correction of the inherent range cell migration by PFA but also eliminates the interference of PFA to the subsequent error compensation processing. Second, to solve the problem that a single compensation method cannot meet the accuracy requirements of THz SAR, the maximum contrast method after polar coordinate format conversion is used for precision compensation. 
The effectiveness of the proposed method is verified through simulation and actual measurement data processing of 0.22-THz airborne spotlight SAR system.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":null,"pages":null},"PeriodicalIF":4.7,"publicationDate":"2024-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10680337","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142248849","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Novel Detector for Wind Turbines in Wide-Ranging, Multiscene Remote Sensing Images","authors":"Jun Xie;Tingting Tian;Richa Hu;Xuan Yang;Yue Xu;Luyang Zan","doi":"10.1109/JSTARS.2024.3460730","DOIUrl":"10.1109/JSTARS.2024.3460730","url":null,"abstract":"Wind turbines are one of the important carriers of clean energy utilization. Accurately and rapidly detecting wind turbine objects in large-scale remote sensing images can effectively monitor the development activities and optimize energy utilization. Addressing the detection challenges posed by the complex distribution scenes and the slender, dispersed structural characteristics of wind turbines in remote sensing images, this article proposes a remote sensing image wind turbine detector, RSWDet, based on neural networks. RSWDet comprises two innovative key modules. The first is a dual-branch structured point set detection head, which, through training, adapts to the unique features of wind turbines, enabling accurate detection in large-scale complex backgrounds. The second is the Low-level Feature Enhancement module, which compensates for the loss of wind turbine feature information during sampling by leveraging rich low-level feature information. Experimental verification of RSWDet was conducted on datasets and real-world scenes. The results demonstrate that RSWDet exhibits significant advantages compared to other algorithms, achieving the highest average accuracy of 83.1%, Precision of 97.8%, and Recall of 99% on the validation set. 
In the actual multiscene GF2 remote sensing image test, with a threshold of 0.4, the Precision can reach 85.3%, and the Recall can reach 89.9%.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":null,"pages":null},"PeriodicalIF":4.7,"publicationDate":"2024-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10680199","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142248851","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Remote Sensing Spectral Index Guided Bitemporal Residual Attention Network for Wildfire Burn Severity Mapping","authors":"Mingda Wu;Qunying Huang;Tang Sui;Bo Peng;Manzhu Yu","doi":"10.1109/JSTARS.2024.3460531","DOIUrl":"10.1109/JSTARS.2024.3460531","url":null,"abstract":"Wildfires cause substantial damage and present considerable risks to both natural ecosystem and human societies. A precise and prompt evaluation of wildfire-induced damage is crucial for effective postfire management and restoration. Considerable advancements have been made in monitoring and mapping fire-affected areas through feature engineering and machine learning techniques. However, existing methods often exhibit several limitations, such as complicated and time-intensive procedures on manual labeling, and a primary focus on binary classification, which only distinguishes between burned and nonburned areas. In response, this study develops a wildfire burn severity assessment model, BiRAUnet-NBR, which can not only accurately identify fire-affected areas, but also assess the burn severity levels (low, moderate, and high) within those areas. Built upon the standard U-Net architecture, the proposed BiRAUnet-NBR first incorporates bitemporal Sentinel 2 Level-2A remote sensing imagery, captured before and after a wildfire, which enables the model to better distinguish burned areas from the background and identify the severity level of the resulting burns. In addition, it further enhances the standard U-Net architecture by fusing additional spectral layers, such as the normalized burn ratio (NBR) derived from post- and prefire images, therefore, informing the detection of burn areas. 
Moreover, BiRAUnet-NBR also integrates attention mechanism, enabling the model to pay more attention to meaningful features and burn areas, and residual blocks in the decoder module, which not only significantly improves segmentation results but also enhances training stability and prevents the issue of vanishing gradients. The experimental results demonstrate the superiority of the proposed model in both multiclass and binary mapping of wildfire burn areas, achieving an overall accuracy over 95%. Furthermore, it outperforms baseline algorithms, including support vector machine, random forest, eXtreme gradient boosting, and fully convolutional network, with an average improvement of 18% in F1-score and 15% in mean intersection over union.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":null,"pages":null},"PeriodicalIF":4.7,"publicationDate":"2024-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10680302","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142248848","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Research on Improved VGG-16 Model Based on Transfer Learning for Acoustic Image Recognition of Underwater Search and Rescue Targets","authors":"Xu Liu, Hanhao Zhu, Weihua Song, Jiahui Wang, Lengleng Yan, Kelin Wang","doi":"10.1109/jstars.2024.3459928","DOIUrl":"https://doi.org/10.1109/jstars.2024.3459928","url":null,"abstract":"","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":null,"pages":null},"PeriodicalIF":5.5,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142193152","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GVANet: A Grouped Multiview Aggregation Network for Remote Sensing Image Segmentation","authors":"Yunsong Yang;Jinjiang Li;Zheng Chen;Lu Ren","doi":"10.1109/JSTARS.2024.3459958","DOIUrl":"10.1109/JSTARS.2024.3459958","url":null,"abstract":"In remote sensing image segmentation tasks, various challenges arise, including difficulties in recognizing objects due to differences in perspective, difficulty in distinguishing objects with similar colors, and challenges in segmentation caused by occlusions. To address these issues, we propose a method called the grouped multiview aggregation network (GVANet), which leverages multiview information for image analysis. This approach enables global multiview expansion and fine-grained cross-layer information interaction within the network. Within this network framework, to better utilize a wider range of multiview information to tackle challenges in remote sensing segmentation, we introduce the multiview feature aggregation block for extracting multiview information. Furthermore, to overcome the limitations of same-level shortcuts when dealing with multiview problems, we propose the channel group fusion block for cross-layer feature information interaction through a grouped fusion approach. Finally, to enhance the utilization of global features during the feature reconstruction phase, we introduce the aggregation-inhibition-activation block for feature selection and focus, which captures the key features for segmentation. 
Comprehensive experimental results on the Vaihingen and Potsdam datasets demonstrate that GVANet outperforms current state-of-the-art methods, achieving mIoU scores of 84.5% and 87.6%, respectively.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":null,"pages":null},"PeriodicalIF":4.7,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10679608","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142193151","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Multilevel Autoadaptive Denoising Algorithm Based on Forested Terrain Slope for ICESat-2 Photon-Counting Data","authors":"Jie Tang;Yanqiu Xing;Jiaqi Wang;Hong Yang;Dejun Wang;Yuanxin Li;Aiting Zhang","doi":"10.1109/JSTARS.2024.3459957","DOIUrl":"10.1109/JSTARS.2024.3459957","url":null,"abstract":"In complex mountainous terrain, the terrain slope causes the scattering of pulsed lasers, generating a lot of noise in photon cloud data (PCD) collected from forestland, which seriously affects the accurate retrieval of forest structure parameters. To address this problem, a multilevel autoadaptive denoising (MLAD) algorithm was proposed in this article. First, random noise photons were removed through the ordering points to identify the clustering structure (OPTICS) algorithm in the coarse denoising process. Second, in the fine denoising step, the circular search domain in the OPTICS algorithm was replaced with an elliptical search domain. The photons after coarse denoising were automatically divided along-track direction into several continuous segments of 100 m each. The median slope method was used to automatically calculate the slope of the forested terrain in each interval segment, so that the range of the ellipse search domain was automatically adjusted to achieve accurate denoising of PCD. Finally, the denoising results of the MLAD algorithm in three different forested terrain areas were compared with those of the difference, regression, and Gaussian adaptive nearest neighbor (DRAGANN) algorithms, and the performance of the MLAD algorithm was evaluated for both different terrain slopes and different vegetation coverages. The results indicated that compared with the DRAGANN algorithm, the MLAD algorithm has higher denoising capability in different regions. The denoising results of the MLAD algorithm exhibit slight changes with the variation in slope, and the \u0000<italic>F</i>\u0000-values are around 0.96, demonstrating good robustness. 
The \u0000<italic>F</i>\u0000-value of the MLAD algorithm mostly exceeds 0.95 in different vegetation coverages. Overall, the MLAD algorithm exhibits stronger noise identification capabilities for complex forest environments. These results can provide a reference for subsequent accurate extraction of forest structural parameters.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":null,"pages":null},"PeriodicalIF":4.7,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10679614","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142193195","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Background Debiased SAR Automatic Target Recognition via a Novel Causal Interventional Regularizer","authors":"Hongwei Dong;Fangzhou Han;Lingyu Si;Wenwen Qiang;Ruiheng Zhang;Lamei Zhang","doi":"10.1109/JSTARS.2024.3459869","DOIUrl":"10.1109/JSTARS.2024.3459869","url":null,"abstract":"Recent studies have utilized deep learning (DL) techniques to automatically extract features from synthetic aperture radar (SAR) images, which shows great promise for enhancing the performance of SAR automatic target recognition (ATR). However, our research reveals a previously overlooked issue: SAR images to be recognized include not only the foreground (i.e., the target), but also a certain size of the background area. When a DL-model is trained exclusively on foreground data, its recognition performance is significantly superior to a model trained on original data that includes both foreground and background. This suggests that the presence of background impedes the ability of the DL-model to learn additional semantic information about the target. To address this issue, we construct a structural causal model (SCM) that incorporates the background as a confounder. Based on the constructed SCM, we propose a causal intervention-based regularization method to eliminate the negative impact of background on feature semantic learning and achieve background debiased SAR-ATR. The proposed causal interventional regularizer can be integrated into any existing DL-based SAR-ATR models, mitigating the impact of background interference on the feature extraction and recognition accuracy without affecting the testing speed of these models. 
Experimental results on the moving and stationary target acquisition and recognition and SAR-AIRcraft-1.0 datasets indicate that the proposed method can enhance the efficiency of existing DL-based methods in a plug-and-play manner.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":null,"pages":null},"PeriodicalIF":4.7,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10679516","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142193155","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Refined Water-Body Types Mapping Using a Water-Scene Enhancement Deep Models by Fusing Optical and SAR Images","authors":"Haozheng Ma;Xiaohong Yang;Runyu Fan;Wei Han;Kang He;Lizhe Wang","doi":"10.1109/JSTARS.2024.3459916","DOIUrl":"10.1109/JSTARS.2024.3459916","url":null,"abstract":"Water is an important element in the ecological environment, and different types of water (e.g., rivers, lakes, and ponds) have different impacts on the ecology. The extraction and classification of different types of water bodies has significant implications for the water resource management and water environment monitoring. Current research on the water-body types classification is relatively limited compared to water body extraction. Existing methods typically adopt a two-stage architecture, where the first stage extracts water bodies at the pixel level, and the second stage classifies the water bodies into different types using rule-based thresholds classifier and morphological features at object level. However, methods in the second stage suffer from overfitting, lack of robustness, and confusion in object segmentation. Despite these challenges, the deep learning methods could capture the high-level semantic features, which are effective for the classification of different types of water bodies. In this article, a novel water-scene enhancement deep model (WSEDM) was proposed for identifying multiple types of water bodies. The WSEDM consists of a pixel-wise water body extraction using Edge-Otsu and a patch-wise water-body types classification through deep learning model. In order to improve the accuracy of patch-wise water body classification, a novel multimodal feature fusion network (CASANet) was designed for the fusing of optical and synthetic aperture radar images. The water-body types classification was conducted on three international wetland cities in the urban agglomeration in the middle reaches of the Yangtze River. 
The 10-m water-body types map achieved an overall accuracy of 94.6%. The proposed CASANet is also validated through comparison and transferability experiments, which further confirmed the superior performance.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":null,"pages":null},"PeriodicalIF":4.7,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10679619","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142193197","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}