A global Swin-Unet Sentinel-2 surface reflectance-based cloud and cloud shadow detection algorithm for the NASA Harmonized Landsat Sentinel-2 (HLS) dataset

Impact Factor: 5.7 | JCR Q1, Environmental Sciences

Science of Remote Sensing | Pub Date: 2025-02-26 | DOI: 10.1016/j.srs.2025.100213

Haiyan Huang, David P. Roy, Hugo De Lemos, Yuean Qiu, Hankui K. Zhang

Published in Science of Remote Sensing, volume 11, Article 100213: https://www.sciencedirect.com/science/article/pii/S2666017225000197

Citations: 0

Abstract

The NASA Harmonized Landsat Sentinel-2 (HLS) dataset provides global, atmospherically corrected surface reflectance with a 30 m cloud and cloud shadow mask derived by applying the Fmask algorithm to top-of-atmosphere (TOA) reflectance. In this study we demonstrate, as have other researchers, low Sentinel-2 Fmask performance, and present a solution that applies a deep learning Swin-Unet model to the HLS surface reflectance to provide unambiguously improved cloud and cloud shadow detection. The model was trained and assessed using 30 m HLS surface reflectance for the 13 Sentinel-2 bands and the corresponding CloudSEN12+ annotations, which define cloud, thin cloud, clear, and cloud shadow classes; CloudSEN12+ is the largest publicly available expert annotation set. All the CloudSEN12 annotations with coincident HLS Sentinel-2 data were considered. A total of 8672 globally distributed 5 × 5 km data sets were used: 7362 to train the model, 464 for internal model validation, and 846 to independently assess the classification accuracy. The HLS Sentinel-2 Fmask had F1-scores of 0.832 (cloud), 0.546 (cloud shadow), and 0.873 (clear). The Swin-Unet model performed better, with F1-scores of 0.891 (cloud and thin cloud combined), 0.710 (cloud shadow), and 0.923 (clear), despite using surface rather than TOA reflectance. The Swin-Unet thin cloud class had low accuracy (0.604 F1-score), likely due to atmospheric correction issues and thin cloud variability, both of which are discussed. The comprehensively trained model provides a solution for users who wish to improve the HLS Sentinel-2 cloud and cloud shadow masking using the available HLS Sentinel-2 surface reflectance data.
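For readers unfamiliar with the metric, the per-class F1-scores quoted in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code: the integer class codes, the function names `merge_thin_cloud` and `f1_per_class`, and the toy label arrays are all assumptions made for illustration. The sketch merges thin cloud into cloud, as the abstract does for the combined cloud score, and computes F1 = 2TP / (2TP + FP + FN) pixel by pixel.

```python
import numpy as np

# Hypothetical class codes; the abstract does not specify an encoding.
CLEAR, CLOUD, THIN_CLOUD, SHADOW = 0, 1, 2, 3


def merge_thin_cloud(labels):
    """Relabel thin cloud as cloud, giving a three-class map."""
    labels = np.asarray(labels)
    return np.where(labels == THIN_CLOUD, CLOUD, labels)


def f1_per_class(reference, predicted, classes):
    """Return {class: F1}, with F1 = 2TP / (2TP + FP + FN) over all pixels."""
    reference = np.asarray(reference).ravel()
    predicted = np.asarray(predicted).ravel()
    scores = {}
    for c in classes:
        tp = np.sum((reference == c) & (predicted == c))
        fp = np.sum((reference != c) & (predicted == c))
        fn = np.sum((reference == c) & (predicted != c))
        denom = 2 * tp + fp + fn
        # F1 is undefined when the class is absent from both maps.
        scores[c] = float(2 * tp / denom) if denom > 0 else float("nan")
    return scores


# Toy 2 x 4 annotation and prediction maps (made-up values).
reference = np.array([[0, 0, 1, 2], [3, 3, 1, 0]])
predicted = np.array([[0, 1, 1, 1], [3, 0, 2, 0]])

scores = f1_per_class(
    merge_thin_cloud(reference),
    merge_thin_cloud(predicted),
    [CLEAR, CLOUD, SHADOW],
)
print(scores)
```

Because F1 combines omission and commission errors for a single class, it can be reported separately for cloud, cloud shadow, and clear, which is how the abstract presents the Fmask and Swin-Unet results.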

Source journal

Science of Remote Sensing

CiteScore

12.20

Self-citation rate

0.00%

Publication count