A lightweight Dual-Stream Attention Network for real-time landslide monitoring in multi-modal remote sensing imagery
Authors: Pooja Dhayal, Pradeep Singh, Kanishk Sharma, Samarpita Sarkar, Dhani Ram Rajak, Alok Bhardwaj, Balasubramanian Raman
Journal: Remote Sensing Applications: Society and Environment, vol. 40, Article 101732 (impact factor 4.5; JCR Q2, Environmental Sciences)
Published: 2025-09-19
DOI: 10.1016/j.rsase.2025.101732
URL: https://www.sciencedirect.com/science/article/pii/S235293852500285X
Rapid delineation of landslides from post-event remote-sensing imagery demands models that can be deployed in the field, often on battery-powered edge devices with strict memory and latency budgets. Most state-of-the-art detectors break those constraints, shipping tens of millions of parameters to gain marginal accuracy. We therefore introduce a Dual-Stream Attention Network whose total trainable footprint is just 2.03 M parameters – roughly the size of a single JPEG image – yet still exploits the complementary physics of co-registered RGB orthomosaics and digital-elevation models. Two light encoders process the modalities independently; a provably isometric late-fusion operator merges their hierarchies without information loss, and a (≤1)-Lipschitz spatial-channel gate discards redundant features while preserving gradient stability. Despite its size, the network attains an IoU of 0.835 and an mIoU of 0.818 on the high-resolution Bijie benchmark, coming within 4–5 percentage points of heavyweight baselines such as DeepLabv3+ (R-101) while using ≈95% fewer weights. Ablation studies show that each architectural choice – dual-stream processing, late fusion, and attentional gating – contributes at least +1.5 pp IoU. A single Jetson Xavier AGX segments a 256 × 256 tile in 1.6 s (<10 W envelope), confirming real-time suitability for rapid landslide mapping missions. By reconciling DEM-derived information with extreme parameter efficiency, the proposed architecture offers a practical foundation for next-generation, on-device geohazard monitoring systems.
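For readers unfamiliar with the two accuracy figures quoted in the abstract, the following is a minimal sketch of how IoU and mIoU are commonly computed for a binary landslide mask. It is not the authors' evaluation code. The function names `class_iou` and `mean_iou`, the toy masks, and the convention that mIoU averages the landslide and background classes are all assumptions made for illustration; the abstract does not state the paper's exact definition.

```python
from typing import Sequence

# A 2-D binary mask: 1 = landslide pixel, 0 = background pixel (assumed encoding).
Mask = Sequence[Sequence[int]]


def class_iou(pred: Mask, truth: Mask, label: int) -> float:
    """Intersection over union of the pixels carrying `label` in each mask."""
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            in_pred = p == label
            in_truth = t == label
            if in_pred and in_truth:
                intersection += 1
            if in_pred or in_truth:
                union += 1
    # Assumed convention: a class absent from both masks counts as a perfect match.
    return intersection / union if union else 1.0


def mean_iou(pred: Mask, truth: Mask, labels: Sequence[int] = (0, 1)) -> float:
    """Unweighted mean of the per-class IoU values (assumed mIoU definition)."""
    return sum(class_iou(pred, truth, c) for c in labels) / len(labels)


if __name__ == "__main__":
    # Hypothetical 3 x 4 tile, for illustration only.
    truth = [[0, 0, 1, 1],
             [0, 1, 1, 1],
             [0, 0, 0, 0]]
    pred = [[0, 1, 1, 1],
            [0, 1, 1, 0],
            [0, 0, 0, 0]]
    print(f"landslide IoU = {class_iou(pred, truth, 1):.3f}")
    print(f"mIoU          = {mean_iou(pred, truth):.3f}")
```

On the toy tile the landslide class overlaps on 4 of 6 pixels in the union and the background class on 6 of 8, so the sketch reports a landslide IoU of about 0.667 and an mIoU of about 0.708.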
About the journal:
The journal "Remote Sensing Applications: Society and Environment" (RSASE) focuses on remote sensing studies that address specific topics with an emphasis on environmental and societal issues, in particular regional and local studies with global significance. Submissions are encouraged to take an interdisciplinary approach. Subjects include, but are not limited to:
- Global and climate change studies addressing the impact of increasing concentrations of greenhouse gases, CO2 emissions, carbon balance and carbon mitigation, and energy systems on social and environmental systems
- Ecological and environmental issues including biodiversity, ecosystem dynamics, land degradation, atmospheric and water pollution, urban footprint, ecosystem management and natural hazards (e.g. earthquakes, typhoons, floods, landslides)
- Natural resource studies including land use in general, biomass estimation, forests, agricultural land, plantations, soils, coral reefs, wetlands and water resources
- Agriculture, food production systems and food security outcomes
- Socio-economic issues including urban systems, urban growth, public health, epidemics, land-use transition and land-use conflicts
- Oceanography and coastal zone studies, including sea level rise projections, coastline changes and the ocean-land interface
- Regional challenges for remote sensing application techniques, monitoring and analysis, such as cloud screening and atmospheric correction for tropical regions
- Interdisciplinary studies combining remote sensing, household survey data, field measurements and models to address environmental, societal and sustainability issues
- Quantitative and qualitative analysis that documents the impact of using remote sensing studies in social, political, environmental or economic systems