Pengfei Tang, Jocelyn Chanussot, Shanchuan Guo, Wei Zhang, Lu Qie, Peng Zhang, Hong Fang, Peijun Du
ISPRS Journal of Photogrammetry and Remote Sensing, Volume 209, Pages 117-132. Journal Article. Published 2024-02-07.
DOI: 10.1016/j.isprsjprs.2024.01.025
URL: https://www.sciencedirect.com/science/article/pii/S0924271624000340
Impact factor: 12.2. JCR: Q1 (Geography, Physical).
Deep learning with multi-scale temporal hybrid structure for robust crop mapping
Large-scale crop mapping from dense time-series images is a difficult task, and cloud cover makes it even more challenging. Current deep learning models frequently represent time series from a single perspective, which is insufficient to capture fine-grained details. Meanwhile, the impact of cloud noise on deep learning models is not yet fully understood. In this study, a Multi-scale Temporal Transformer-Conv network (Ms-TTC) is proposed for robust crop mapping under frequent cloud cover. The Ms-TTC enhances temporal representations by combining the global modeling capability of self-attention with the local feature-capturing capability of convolutional neural networks (CNNs) at multiple temporal scales. The Ms-TTC network consists of three main components: (1) a temporal encoder module that explores global and local temporal relationships at multiple temporal scales, (2) an attention-based fusion module that fuses the multi-scale temporal features, and (3) an output module that concatenates the high-level time-series features and the refined multi-scale features to predict the label. The proposed model outperformed state-of-the-art methods on the large-scale time-series dataset FranceCrops, improving mF1 scores by at least 2%. Subsequently, gradient back-propagation-based feature importance analysis was used to investigate how deep learning models process time-series data containing cloud noise. The results revealed that most deep learning models can suppress cloudy observations to some degree, and that models with a global field of view achieved superior cloud masking but also lost some local temporal information. Clouds can also shift a model's attention along the spectral dimension, particularly affecting the visible and vegetation red-edge bands, which are more sensitive to cloud noise and play a crucial role in performance. This study provides a feasible approach for large-scale dynamic crop mapping regardless of cloud conditions by combining global and local temporal representations at multiple scales.
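The idea of pairing a global self-attention view with a local convolutional view at several temporal scales, then fusing the scales with attention weights, can be illustrated with a small dependency-free sketch. This is not the Ms-TTC implementation: the abstract gives no layer details, so everything here is an assumption made for illustration, including the function names, the identity attention projections, the fixed smoothing kernel, average pooling as the way to change temporal scale, the pooling factors (1, 2, 4), and the scoring rule used for fusion. The output module that concatenates features to predict a label is omitted.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(seq):
    """Global branch: each time step attends to all time steps.
    Assumption: identity query/key/value projections (no learned weights)."""
    d = len(seq[0])
    out = []
    for q in seq:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in seq]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, seq)) for j in range(d)])
    return out

def local_conv(seq, kernel=(0.25, 0.5, 0.25)):
    """Local branch: 1D convolution along time with edge replication.
    Assumption: a fixed smoothing kernel stands in for learned filters."""
    half = len(kernel) // 2
    n_steps, d = len(seq), len(seq[0])
    out = []
    for t in range(n_steps):
        row = [0.0] * d
        for i, w in enumerate(kernel):
            src = seq[min(max(t + i - half, 0), n_steps - 1)]
            for j in range(d):
                row[j] += w * src[j]
        out.append(row)
    return out

def avg_pool(seq, factor):
    """Coarsen the temporal scale by averaging non-overlapping windows."""
    out = []
    for start in range(0, len(seq), factor):
        window = seq[start:start + factor]
        out.append([sum(col) / len(window) for col in zip(*window)])
    return out

def encode_scale(seq, factor):
    """Hybrid encoding at one scale: add both branches, then average over time."""
    coarse = avg_pool(seq, factor)
    global_out = self_attention(coarse)
    local_out = local_conv(coarse)
    mixed = [[a + b for a, b in zip(g, l)] for g, l in zip(global_out, local_out)]
    return [sum(col) / len(mixed) for col in zip(*mixed)]

def fuse(features):
    """Attention-style fusion: softmax weights over scales.
    Assumption: each scale is scored by the mean of its feature vector."""
    weights = softmax([sum(f) / len(f) for f in features])
    d = len(features[0])
    fused = [sum(w * f[j] for w, f in zip(weights, features)) for j in range(d)]
    return fused, weights

# Toy series: 8 time steps, 2 spectral bands (made-up values).
series = [[0.1 * t, 1.0 - 0.1 * t] for t in range(8)]
features = [encode_scale(series, factor) for factor in (1, 2, 4)]
fused, scale_weights = fuse(features)
```

In a trained network the projections, convolution filters, and fusion scores would all be learned. The sketch only shows how the two branches see the same sequence differently: attention mixes information across the whole season, while the convolution only looks at neighbouring acquisitions.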
About the journal:
The ISPRS Journal of Photogrammetry and Remote Sensing (P&RS) serves as the official journal of the International Society for Photogrammetry and Remote Sensing (ISPRS). It acts as a platform for scientists and professionals worldwide who are involved in various disciplines that utilize photogrammetry, remote sensing, spatial information systems, computer vision, and related fields. The journal aims to facilitate communication and dissemination of advancements in these disciplines, while also acting as a comprehensive source of reference and archive.
P&RS endeavors to publish high-quality, peer-reviewed research papers that are preferably original and have not been published before. These papers can cover scientific/research, technological development, or application/practical aspects. Additionally, the journal welcomes papers that are based on presentations from ISPRS meetings, as long as they are considered significant contributions to the aforementioned fields.
In particular, P&RS encourages the submission of papers that are of broad scientific interest, showcase innovative applications (especially in emerging fields), have an interdisciplinary focus, discuss topics that have received limited attention in P&RS or related journals, or explore new directions in scientific or professional realms. It is preferred that theoretical papers include practical applications, while papers focusing on systems and applications should include a theoretical background.