Evaluation of focal loss based deep neural networks for traffic sign detection
Authors: Deepika Kamboj, Sharda Vashisth, Sumeet Saurav
Journal: International Journal of Image and Data Fusion (JCR Q3, Remote Sensing; impact factor 1.8)
Published: 2022-06-21
DOI: https://doi.org/10.1080/19479832.2022.2086304
ABSTRACT With advancements in autonomous driving, demand for stringent and computationally efficient traffic sign detection systems has increased. However, bringing such a system to a deployable level requires handling critical accuracy and processing speed issues. A focal loss-based single-stage object detector, i.e., RetinaNet, is used as a trade-off between accuracy and processing speed, as it handles the class imbalance problem of single-stage detectors and is thus suitable for traffic sign detection (TSD). We assessed the detector’s performance by combining it with three feature extractors (ResNet-50, ResNet-101, and ResNet-152) on three publicly available TSD benchmark datasets. The comparison across backbones covers mean average precision (mAP), memory allocation, running time, and floating-point operations. From the evaluation results, we found that the RetinaNet object detector using the ResNet-152 backbone obtains the best mAP, while that using ResNet-101 strikes the best trade-off between accuracy and execution time. Benchmarking on several datasets lets us analyse how the detector’s performance varies across TSD benchmarks. Among the three feature extractors, the RetinaNet model trained with the ResNet-50 backbone stands out for its low memory consumption, making it an optimal choice for deployment on low-cost embedded devices.
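The abstract states that focal loss handles the class imbalance of single-stage detectors but does not spell out the loss itself. As a minimal sketch, assuming the standard binary form FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t) and the commonly used values alpha = 0.25 and gamma = 2 (neither the form nor the values are given in the abstract, and the function names are illustrative), the down-weighting of easy examples can be seen as follows:

```python
import math


def cross_entropy(p, y, eps=1e-12):
    """Binary cross-entropy for one prediction (p = predicted P(class = 1))."""
    p_t = p if y == 1 else 1.0 - p          # probability assigned to the true class
    return -math.log(min(max(p_t, eps), 1.0))


def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-12):
    """Binary focal loss for one prediction.

    alpha and gamma are assumed defaults, not values reported in the abstract.
    """
    p_t = p if y == 1 else 1.0 - p          # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    p_t = min(max(p_t, eps), 1.0)           # guard against log(0)
    # (1 - p_t)^gamma shrinks the loss of well-classified (easy) examples
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# An easy background anchor (p_t = 0.99) versus a hard sign anchor (p_t = 0.10)
easy_background = focal_loss(0.01, 0)
hard_foreground = focal_loss(0.10, 1)
print(easy_background, hard_foreground)
```

With gamma set to 0 the modulating factor disappears and the expression reduces to alpha-weighted cross-entropy, which is why the many easy background anchors dominate training less under focal loss than under plain cross-entropy.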
About the journal:
International Journal of Image and Data Fusion provides a single source of information for all aspects of image and data fusion methodologies, developments, techniques and applications. Image and data fusion techniques are important for combining the many sources of satellite, airborne and ground based imaging systems, and integrating these with other related data sets for enhanced information extraction and decision making. Image and data fusion aims at the integration of multi-sensor, multi-temporal, multi-resolution and multi-platform image data, together with geospatial data, GIS, in-situ, and other statistical data sets for improved information extraction, as well as to increase the reliability of the information. This leads to more accurate information that provides for robust operational performance, i.e. increased confidence, reduced ambiguity and improved classification enabling evidence based management.
The journal welcomes original research papers, review papers, shorter letters, technical articles, book reviews and conference reports in all areas of image and data fusion including, but not limited to, the following aspects and topics:
• Automatic registration/geometric aspects of fusing images with different spatial, spectral, temporal resolutions; phase information; or acquired in different modes
• Pixel, feature and decision level fusion algorithms and methodologies
• Data Assimilation: fusing data with models
• Multi-source classification and information extraction
• Integration of satellite, airborne and terrestrial sensor systems
• Fusing temporal data sets for change detection studies (e.g. for Land Cover/Land Use Change studies)
• Image and data mining from multi-platform, multi-source, multi-scale, multi-temporal data sets (e.g. geometric information, topological information, statistical information, etc.).