DepthFormer: Depth-Enhanced Transformer Network for Semantic Segmentation of the Martian Surface From Rover Images
Authors: Yuan Ma, Zhaojin Li, Bo Wu, Ran Duan
Journal: Earth and Space Science, volume 12, issue 6 (published 2025-06-13)
DOI: 10.1029/2024EA003812
URL: https://onlinelibrary.wiley.com/doi/10.1029/2024EA003812
The Martian surface, with its diverse landforms that reflect the planet's evolution, has attracted increasing scientific interest. Interpreting it requires extensive data, and identifying landform types is a crucial first step. This semantic information reveals underlying features and patterns, offering valuable scientific insights. Advanced deep learning techniques, particularly Transformers, can enhance semantic segmentation and image interpretation, deepening our understanding of Martian surface features. However, current publicly available neural networks are trained on terrestrial imagery, which makes them unsuitable for direct application to the Martian surface. In addition, the Martian surface is poorly textured and its scenes are homogeneous, which makes it difficult to segment the images into meaningful semantic classes. In this paper, an innovative depth-enhanced Transformer network, DepthFormer, is developed for the semantic segmentation of Martian surface images. The stereo images acquired by the Zhurong rover along its traverse are used for training and testing the DepthFormer network. Unlike regular deep-learning networks, which handle only the three image bands (red, green, and blue), DepthFormer incorporates the depth information available from the stereo images as a fourth band in the network to enable more accurate segmentation of various surface features. Experimental evaluations and comparisons using synthesized and actual Mars image data sets show that DepthFormer achieves an average accuracy of 98%, exceeding that of conventional segmentation methods. The proposed method is the first deep-learning model to incorporate depth information for accurate semantic segmentation of the Martian surface, which is significant for future Mars exploration missions and scientific studies.
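To make the "depth as a fourth band" idea concrete, here is a minimal NumPy sketch of how a four-channel input could be assembled from an RGB image and a stereo disparity map. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how depth is computed or scaled, so the pinhole relation depth = focal length × baseline / disparity, the min-max normalization, and the names `disparity_to_depth`, `stack_rgbd`, `focal_px`, and `baseline_m` are all hypothetical choices made for this example.

```python
import numpy as np


def disparity_to_depth(disparity, focal_px, baseline_m):
    """Convert a disparity map (pixels) to depth (metres).

    Uses the standard rectified-stereo relation depth = f * B / d.
    Pixels with non-positive disparity have no valid match and become NaN.
    focal_px and baseline_m are hypothetical camera parameters.
    """
    disparity = np.asarray(disparity, dtype=np.float64)
    depth = np.full(disparity.shape, np.nan)
    valid = disparity > 0
    depth[valid] = focal_px * baseline_m / disparity[valid]
    return depth


def stack_rgbd(rgb, depth):
    """Build an (H, W, 4) float32 array: RGB in [0, 1] plus normalized depth.

    Assumption: depth is min-max normalized over its finite pixels, and
    pixels without valid depth are filled with 0.
    """
    rgb = np.asarray(rgb, dtype=np.float32) / 255.0
    depth = np.asarray(depth, dtype=np.float32)
    if rgb.shape[:2] != depth.shape:
        raise ValueError("rgb and depth must share height and width")

    finite = np.isfinite(depth)
    if finite.any():
        lo, hi = depth[finite].min(), depth[finite].max()
    else:
        lo, hi = 0.0, 1.0
    scale = hi - lo if hi > lo else 1.0

    depth_norm = np.where(finite, (depth - lo) / scale, 0.0)
    # Append depth as the fourth channel after R, G, B.
    return np.concatenate([rgb, depth_norm[..., None]], axis=-1).astype(np.float32)


# Toy example with made-up camera parameters.
rgb = np.zeros((2, 3, 3), dtype=np.uint8)
rgb[..., 0] = 255
disparity = np.array([[1, 2, 4], [0, 2, 4]])
depth = disparity_to_depth(disparity, focal_px=100.0, baseline_m=0.5)
rgbd = stack_rgbd(rgb, depth)
print(rgbd.shape)
```

A segmentation network consuming this array would need a first layer that accepts four input channels instead of three; how DepthFormer handles that internally is not described in the abstract.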
About the journal:
Marking AGU’s second new open access journal in the last 12 months, Earth and Space Science is the only journal that reflects the expansive range of science represented by AGU’s 62,000 members, including all of the Earth, planetary, and space sciences, and related fields in environmental science, geoengineering, space engineering, and biogeochemistry.