Swin-ResUNet: A Swin-Topology Module for Road Extraction from Remote Sensing Images

2022 International Conference on Digital Image Computing: Techniques and Applications (DICTA). Pub Date: 2022-11-30. DOI: 10.1109/DICTA56598.2022.10034582


Citations: 1

Abstract

Road extraction from remote sensing images plays a crucial role in navigation, traffic management, urban construction, and other fields. With the development of deep learning in computer vision, road extraction from remote sensing images using deep learning models has become an active research topic. Convolution-based U-shaped road extraction models suffer from high extraction error rates and poor continuity of road topology, while Transformer-based road extraction methods suffer from low extraction accuracy and large GPU memory usage. To address these issues, we propose a Swin-ResUNet structure that uses the Swin Transformer to extract roads from remote sensing images. Specifically, we construct a Swin-Topology module by adding a Sobel layer based on residual connections to the Swin Transformer block. Building on the Swin-Topology module, we propose the Swin-ResUNet network structure to better capture the topology of roads. Experimental results show mIOU and mDC values of 64.1% and 76.6%, respectively, on the Massachusetts dataset, and 66.69% and 75.86%, respectively, on the DeepGlobe2018 dataset. With a batch size of 8, Swin-ResUNet uses about 9 GB of GPU memory, significantly less than other Transformer-based networks. Compared with convolution-based U-shaped structures, Swin-ResUNet better captures the topology of roads in remote sensing images and improves road extraction accuracy. Compared with other Transformer-based road extraction methods, Swin-ResUNet improves extraction accuracy and reduces GPU memory usage.
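
The abstract reports mIOU and mDC without defining them. As a rough illustration only, the sketch below assumes that mDC means the mean Dice coefficient and that both metrics are averaged over the two classes (road and background) of a binary mask; the paper may average differently (for example, per image). The function names `confusion_counts` and `mean_iou_and_dice` are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

Mask = Sequence[Sequence[int]]  # 2-D mask; assumed labels: 1 = road, 0 = background


def confusion_counts(pred: Mask, target: Mask, cls: int) -> Tuple[int, int, int]:
    """Return (true positives, false positives, false negatives) for class `cls`."""
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == cls and t == cls:
                tp += 1
            elif p == cls:
                fp += 1
            elif t == cls:
                fn += 1
    return tp, fp, fn


def mean_iou_and_dice(pred: Mask, target: Mask,
                      classes: Sequence[int] = (0, 1)) -> Tuple[float, float]:
    """Average IoU and Dice over the classes that occur in either mask.

    Assumption: class-wise averaging; this is one common convention, not
    necessarily the one used in the paper.
    """
    ious: List[float] = []
    dices: List[float] = []
    for cls in classes:
        tp, fp, fn = confusion_counts(pred, target, cls)
        if tp + fp + fn == 0:
            continue  # class absent from both masks, so it is skipped
        ious.append(tp / (tp + fp + fn))           # IoU  = TP / (TP + FP + FN)
        dices.append(2 * tp / (2 * tp + fp + fn))  # Dice = 2TP / (2TP + FP + FN)
    if not ious:
        raise ValueError("none of the requested classes occur in the masks")
    return sum(ious) / len(ious), sum(dices) / len(dices)


if __name__ == "__main__":
    predicted = [[1, 1, 0],
                 [0, 0, 0]]
    reference = [[1, 0, 0],
                 [0, 1, 0]]
    miou, mdice = mean_iou_and_dice(predicted, reference)
    print(f"mIoU = {miou:.4f}, mean Dice = {mdice:.4f}")
```

For a single class the two scores are linked by Dice = 2·IoU / (1 + IoU), so Dice is never lower than IoU; this is consistent with the reported mDC values exceeding the mIOU values on both datasets.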
