Jingchun Zhou;Zongxin He;Dehuan Zhang;Siyuan Liu;Xianping Fu;Xuelong Li
Title: Spatial Residual for Underwater Object Detection
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 47, no. 6, pp. 4996-5013
Publication type: Journal Article
Published: 2025-03-06
DOI: 10.1109/TPAMI.2025.3548652
URL: https://ieeexplore.ieee.org/document/10916506/
Feature drift is caused by the dynamic coupling of target features and degradation factors, which reduces underwater detector performance. We redefine feature drift as the instability of target features within boundary constraints while solving partial differential equations (PDEs). From this insight, we propose the Spatial Residual (SR) block, which uses SkipCut to establish effective constraints across the network width for solving PDEs and to optimize the solution space. The block is implemented as a general-purpose backbone with 5 Spatial Residuals (BSR5) for complex feature scenarios. Specifically, BSR5 extracts discrete channel slices through SkipCut, so that each sliced feature is parsed within the appropriate data capacity. During gradient backpropagation, SkipCut functions as a ShortCut, optimizing information flow and gradient allocation to enhance performance and accelerate training. Experiments on the RUOD dataset show that BSR5-integrated DETRs and YOLOs achieve state-of-the-art results for conventional and end-to-end detectors. Specifically, our BSR5-DETR improves AP by 1.3% and 2.7% over RT-DETR with ResNet-101 while reducing parameters by 41.6% and 6.6%, respectively. Further validation highlights BSR5's strong convergence and robustness, especially in training-from-scratch scenarios, making it well suited for data-scarce, resource-constrained, and real-time tasks.
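The abstract describes SkipCut only at a high level, so the following is a loose illustration of the general idea of splitting a feature into channel slices and giving each slice an identity skip path. It is not the authors' SR block. The function names (`split_channels`, `scale`, `sliced_residual`), the equal-sized contiguous slicing, the scaling transform, and the additive skip are all assumptions made for this sketch.

```python
from typing import Callable, List

Channel = List[float]   # one flattened feature map
Feature = List[Channel]  # a feature tensor represented as a list of channels


def split_channels(x: Feature, num_slices: int) -> List[Feature]:
    """Split the channel axis into contiguous, equal-sized slices (an assumption)."""
    if num_slices <= 0 or len(x) % num_slices != 0:
        raise ValueError("channel count must be divisible by num_slices")
    size = len(x) // num_slices
    return [x[i * size:(i + 1) * size] for i in range(num_slices)]


def scale(factor: float) -> Callable[[Feature], Feature]:
    """Hypothetical stand-in for a learned per-slice transform."""
    return lambda s: [[factor * v for v in ch] for ch in s]


def sliced_residual(x: Feature,
                    transforms: List[Callable[[Feature], Feature]]) -> Feature:
    """Apply one transform per channel slice, add an identity skip, re-concatenate."""
    slices = split_channels(x, len(transforms))
    out: Feature = []
    for s, f in zip(slices, transforms):
        y = f(s)
        # Identity skip: output = transform(slice) + slice.
        out.extend([[a + b for a, b in zip(yc, sc)] for yc, sc in zip(y, s)])
    return out


# 4 channels with 2 values each, processed as 2 slices of 2 channels.
x = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
y = sliced_residual(x, [scale(0.0), scale(1.0)])
print(y)  # [[1.0, 2.0], [3.0, 4.0], [10.0, 12.0], [14.0, 16.0]]
```

In this toy version the first slice passes through unchanged even though its transform outputs zeros, which is the usual property of an identity skip: the input (and, in a trainable network, its gradient) has a direct path around the transform. Whether SkipCut recombines slices by addition, concatenation, or some other rule is not stated in the abstract.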