LMFE-RDD: a road damage detector with a lightweight multi-feature extraction network
Authors: Qihan He, Zhongxu Li, Wenyuan Yang
Journal: Multimedia Systems (published 2024-06-14)
DOI: https://doi.org/10.1007/s00530-024-01367-z
Road damage detection applies computer vision and deep learning to identify road damage automatically. As an application of object detection, it can significantly improve the efficiency of road maintenance planning and repair work and help ensure road safety. However, because the recognition task is complex, existing road damage detection models usually carry a large number of parameters and a heavy computational load. The resulting slow inference limits their deployment on devices with limited computing resources. In this study, we propose LMFE-RDD, a road damage detector that balances speed and accuracy. It uses a Lightweight Multi-Feature Extraction Network (LMFE-Net) as the backbone and an Efficient Semantic Fusion Network (ESF-Net) for multi-scale feature fusion. First, LMFE-Net takes a road damage image as input and produces feature maps at three different scales. Second, ESF-Net fuses these three feature maps and outputs three fused features. Finally, the fused features are sent to the detection head for target classification and localization, which yields the final result. In addition, we use WDB loss, a multi-task loss function with a non-monotonic dynamic focusing mechanism, to pay more attention to bounding box regression losses. Experimental results show that LMFE-RDD achieves competitive accuracy while maintaining speed. On the Multi-Perspective Road Damage Dataset, combining the data from all perspectives, LMFE-RDD reaches a detection speed of 51.0 FPS and 64.2% mAP@0.5 with only 13.5 M parameters.
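The three-stage data flow described in the abstract (backbone, fusion, detection head) can be pictured with a small shape-tracking sketch. This is not the authors' implementation: the strides (8, 16, 32), channel widths, 640x640 input size, four damage classes, and the one-prediction-per-grid-cell head are all assumptions chosen for illustration, and the names `FeatureMap`, `backbone`, `fuse`, `head`, and `detect_shapes` are hypothetical. The functions only track tensor shapes and perform no learning or real feature fusion.

```python
from dataclasses import dataclass

# Assumed values for illustration only; the abstract does not state them.
ASSUMED_STRIDES = (8, 16, 32)
ASSUMED_CHANNELS = (128, 256, 512)


@dataclass(frozen=True)
class FeatureMap:
    """Shape of one feature map: channels x height x width."""
    channels: int
    height: int
    width: int


def backbone(image_h: int, image_w: int) -> list[FeatureMap]:
    """Stand-in for LMFE-Net: one image in, three feature maps at different scales out."""
    return [
        FeatureMap(channels, image_h // stride, image_w // stride)
        for channels, stride in zip(ASSUMED_CHANNELS, ASSUMED_STRIDES)
    ]


def fuse(features: list[FeatureMap]) -> list[FeatureMap]:
    """Stand-in for ESF-Net: three maps in, three fused maps out.

    A real fusion network mixes information across scales; this
    placeholder only preserves the shapes.
    """
    if len(features) != 3:
        raise ValueError("expected exactly three feature maps")
    return list(features)


def head(fused: list[FeatureMap], num_classes: int) -> list[tuple[int, int]]:
    """Stand-in detection head (assumed: one prediction per grid cell).

    Each prediction holds 4 box coordinates plus one score per class.
    Returns (number of predictions, values per prediction) for each scale.
    """
    return [(f.height * f.width, 4 + num_classes) for f in fused]


def detect_shapes(image_h: int = 640, image_w: int = 640,
                  num_classes: int = 4) -> list[tuple[int, int]]:
    """Run the three stages in the order the abstract describes."""
    return head(fuse(backbone(image_h, image_w)), num_classes)


if __name__ == "__main__":
    for n_preds, n_values in detect_shapes():
        print(f"{n_preds} predictions x {n_values} values")
```

Under these assumptions, the finest scale contributes the most candidate predictions, which is why a lightweight backbone and fusion stage matter for inference speed.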
About the journal:
This journal details innovative research ideas, emerging technologies, state-of-the-art methods and tools in all aspects of multimedia computing, communication, storage, and applications. It features theoretical, experimental, and survey articles.