[An efficient and lightweight skin pathology detection method based on multi-scale feature fusion using an improved RT-DETR model].
Authors: Yuying Ren, Lingxiao Huang, Fang DU, Xinbo Yao
Journal: 南方医科大学学报杂志 (Journal of Southern Medical University), vol. 45, no. 2, pp. 409-421
Published: 2025-02-20
DOI: https://doi.org/10.12122/j.issn.1673-4254.2025.02.22
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11875869/pdf/
Objectives: In skin disease detection, accuracy is limited by the multi-scale nature of lesion regions, by image noise, and by the limited computing resources of auxiliary diagnostic equipment. To address these problems, we propose an efficient, lightweight skin disease detection model based on an improved RT-DETR.
Methods: A lightweight FasterNet was introduced as the backbone network, and the parameters of the FasterNetBlock module were refined. A Convolutional and Attention Fusion Module (CAFM) replaced the multi-head self-attention mechanism in the neck network, so that the resulting AIFI-CAFM module captures both global dependencies and local detail. A DRB-HSFPN feature pyramid network was designed to replace the Cross-Scale Feature Fusion Module (CCFM), integrating contextual information across scales and improving the semantic feature expression capacity of the neck network. Finally, Inner-EIoU, which combines the advantages of Inner-IoU and EIoU, replaced the original GIoU loss function to further improve the model's inference accuracy and convergence speed.
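The abstract does not give the Inner-EIoU formula. As a rough illustration only, the sketch below assumes the commonly cited formulation: EIoU adds center-distance, width, and height penalties to 1 - IoU, and the Inner variant swaps in an IoU computed on auxiliary boxes scaled about each box center. The function names, the (cx, cy, w, h) box layout, and the default ratio of 0.75 are assumptions for this example, not details taken from the paper.

```python
def to_xyxy(box, ratio=1.0):
    """Convert (cx, cy, w, h) to corner form, scaling w and h by `ratio`."""
    cx, cy, w, h = box
    half_w, half_h = w * ratio / 2.0, h * ratio / 2.0
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)


def iou_xyxy(a, b):
    """Plain intersection-over-union of two corner-form boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def inner_eiou_loss(pred, target, ratio=0.75, eps=1e-9):
    """Assumed form: L_EIoU + IoU - IoU_inner, for (cx, cy, w, h) boxes."""
    p, t = to_xyxy(pred), to_xyxy(target)
    iou = iou_xyxy(p, t)
    # Inner-IoU: IoU of auxiliary boxes shrunk (ratio < 1) or grown (ratio > 1).
    inner_iou = iou_xyxy(to_xyxy(pred, ratio), to_xyxy(target, ratio))

    # Width and height of the smallest box enclosing both inputs.
    enc_w = max(p[2], t[2]) - min(p[0], t[0])
    enc_h = max(p[3], t[3]) - min(p[1], t[1])

    # EIoU penalties: center distance, width gap, height gap.
    center = ((pred[0] - target[0]) ** 2 + (pred[1] - target[1]) ** 2) / (
        enc_w ** 2 + enc_h ** 2 + eps
    )
    width = (pred[2] - target[2]) ** 2 / (enc_w ** 2 + eps)
    height = (pred[3] - target[3]) ** 2 / (enc_h ** 2 + eps)

    eiou = 1.0 - iou + center + width + height
    return eiou + iou - inner_iou


if __name__ == "__main__":
    print(inner_eiou_loss((5.0, 5.0, 4.0, 4.0), (6.0, 5.0, 4.0, 4.0)))
```

Under this assumed form the loss is zero for a perfect match and exceeds 1 for disjoint boxes, because the penalty terms remain after both IoU values fall to zero. A training implementation would operate on batched tensors rather than Python tuples.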
Results: On the HAM10000 dataset, the improved RT-DETR model increased mAP@50 and mAP@50:95 by 4.5% and 2.8%, respectively, over the original model, with a detection speed of 59.1 frames per second (FPS). The improved model had 10.9 M parameters and a computational load of 19.3 GFLOPs, reductions of 46.0% and 67.2%, respectively, relative to the original model, which supports the effectiveness of the improved model.
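The abstract reports the improved model's size and the percentage reductions, but not the baseline figures. The short calculation below back-computes them from the reported numbers and converts the frame rate into a per-frame latency. These derived values are estimates implied by rounded figures, not results stated in the abstract, and the helper name `baseline_from_reduction` is made up for this example.

```python
def baseline_from_reduction(reduced_value, reduction_fraction):
    """Recover the original value given the reduced value and the fractional cut."""
    return reduced_value / (1.0 - reduction_fraction)


# Figures reported in the abstract.
params_millions = 10.9      # parameter count after improvement
params_reduction = 0.460    # 46.0% fewer parameters
gflops = 19.3               # computational load after improvement
gflops_reduction = 0.672    # 67.2% lower computational load
fps = 59.1                  # detection speed

baseline_params = baseline_from_reduction(params_millions, params_reduction)
baseline_gflops = baseline_from_reduction(gflops, gflops_reduction)
latency_ms = 1000.0 / fps

print(f"implied baseline parameters: {baseline_params:.1f} M")      # 20.2 M
print(f"implied baseline load:       {baseline_gflops:.1f} GFLOPs")  # 58.8 GFLOPs
print(f"per-frame latency:           {latency_ms:.1f} ms")           # 16.9 ms
```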
Conclusions: The proposed SD-DETR model significantly improves skin disease detection performance by effectively extracting and integrating multi-scale features while reducing both parameter count and computational load.