SCTNet: A Shallow CNN–Transformer Network With Statistics-Driven Modules for Cloud Detection
Authors: Weixing Liu, Bin Luo, Jun Liu, Han Nie, Xin Su
Journal: IEEE Geoscience and Remote Sensing Letters, vol. 22, pp. 1-5
Published: 2025-04-15
DOI: 10.1109/LGRS.2025.3561004
URL: https://ieeexplore.ieee.org/document/10965745/
Citations: 0
Abstract
Existing cloud detection methods often rely on deep neural networks, leading to excessive computational overhead. To address this, we propose a shallow convolutional neural network (CNN)–Transformer hybrid architecture that limits the maximum downsampling rate to $8\times$. This design preserves local details while effectively capturing global context through a lightweight Transformer branch. To enhance adaptability across diverse cloud scenes, we introduce two novel statistics-driven modules: statistics-adaptive convolution (SAC) and statistical mixing augmentation (SMA). SAC dynamically generates convolutional kernels based on input feature statistics, enabling adaptive feature extraction for varying cloud patterns. SMA improves model generalization by interpolating channel-wise statistics across training samples, increasing feature diversity. Experiments on four datasets show that the proposed method achieves state-of-the-art performance with 732 K parameters and 1 G multiply-accumulate operations (MACs). Our code will be available at https://weix-liu.github.io/ for further research.
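The abstract describes SMA only as interpolating channel-wise statistics across training samples. The following is a minimal sketch of one way such an augmentation could work, not the authors' implementation. It assumes that each sample's feature map is normalized per channel and then re-scaled with a convex combination of its own mean and standard deviation and those of a randomly chosen partner sample. The function names, the Beta-distributed mixing weight, the `alpha` value, and the nested-list layout (`[channel][flattened pixels]`) are all hypothetical choices made for illustration.

```python
import math
import random
from statistics import fmean, pvariance


def channel_stats(feature, eps=1e-6):
    """Per-channel mean and std for one sample laid out as [C][N]."""
    means = [fmean(ch) for ch in feature]
    # eps keeps the std strictly positive for constant channels.
    stds = [math.sqrt(pvariance(ch) + eps) for ch in feature]
    return means, stds


def mix_statistics(feat_a, feat_b, lam, eps=1e-6):
    """Re-scale feat_a with statistics interpolated between feat_a and feat_b.

    lam = 1 keeps feat_a's own statistics; lam = 0 adopts feat_b's.
    """
    mean_a, std_a = channel_stats(feat_a, eps)
    mean_b, std_b = channel_stats(feat_b, eps)
    mixed = []
    for ch, m_a, s_a, m_b, s_b in zip(feat_a, mean_a, std_a, mean_b, std_b):
        m_mix = lam * m_a + (1.0 - lam) * m_b
        s_mix = lam * s_a + (1.0 - lam) * s_b
        # Normalize with the sample's own statistics, then apply the mixed ones.
        mixed.append([(v - m_a) / s_a * s_mix + m_mix for v in ch])
    return mixed


def statistical_mixing(batch, rng, alpha=0.1):
    """Mix each sample's channel statistics with a randomly assigned partner.

    The Beta(alpha, alpha) weight is an assumption; the abstract does not
    say how the interpolation coefficient is chosen.
    """
    partners = list(range(len(batch)))
    rng.shuffle(partners)
    out = []
    for i, j in enumerate(partners):
        lam = rng.betavariate(alpha, alpha)
        out.append(mix_statistics(batch[i], batch[j], lam))
    return out


if __name__ == "__main__":
    # Two toy samples, each with 2 channels of 4 flattened pixel values.
    sample_a = [[1.0, 2.0, 3.0, 4.0], [10.0, 10.0, 12.0, 12.0]]
    sample_b = [[0.0, 0.0, 2.0, 2.0], [5.0, 6.0, 7.0, 8.0]]
    augmented = statistical_mixing([sample_a, sample_b], random.Random(0))
    for sample in augmented:
        print([round(fmean(ch), 3) for ch in sample])
```

In this sketch the spatial pattern of each channel is preserved and only its mean and spread change, which is one plausible reading of how mixing statistics could increase feature diversity without altering cloud structure. An augmentation of this kind would typically be applied during training only.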