Automated classification and detection of multiple pavement distress images based on deep learning

IF 7.4, Zone 2 (Engineering Technology), JCR Q1 (ENGINEERING, CIVIL)

Journal of Traffic and Transportation Engineering-English Edition, Volume 10, Issue 2, Pages 276-290. Pub Date: 2023-04-01. DOI: 10.1016/j.jtte.2021.04.008

Deru Li, Zhongdong Duan, Xiaoyang Hu, Dongchang Zhang, Yiying Zhang


Citations: 2

Abstract



To achieve automatic, fast, efficient, and high-precision pavement distress classification and detection, deep-learning models for pavement distress image classification and detection are trained. First, a pavement distress image dataset is built, comprising 9017 images with distress and 9620 images without distress. The images were captured on 4 asphalt highways in 3 provinces of China. Each distress image contains one or more types of distress: alligator crack, longitudinal crack, block crack, transverse crack, pothole, and patch. Each distress is labeled with a rectangular bounding box. ResNet and VGG networks are then used as binary classification models to separate distressed from non-distressed images, and as multi-label classification models to identify the six types of distress. Training techniques such as data augmentation, batch normalization, dropout, momentum, weight decay, transfer learning, and discriminative learning rates are used. Among the 4 CNNs considered in this study (ResNet 34, ResNet 50, VGG 16, and VGG 19), for binary classification ResNet 50 achieves the highest Accuracy (96.243%) and Precision (95.183%), while ResNet 34 achieves the highest Recall (97.824%) and F2 score (97.052%). For multi-label classification, ResNet 50 performs best, with the highest Accuracy (90.257%, above the 90% required by the Chinese standard JTG H20-2018 for road distress detection), F2 score (82.231%), and Precision (76.509%), while ResNet 34 achieves the highest Recall (87.32%). To locate and quantify the distress areas in the images, a single shot multibox detector (SSD) model is developed, with ResNet 50 as the base network for feature extraction. With the intersection over union (IoU) threshold set to 0, 0.25, 0.50, and 0.75, the mean average precision (mAP) of the model is 74.881%, 50.511%, 28.432%, and 3.969%, respectively.
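The abstract reports F2 scores and IoU-based mAP without giving formulas. The sketch below shows the standard textbook definitions of both quantities, on the assumption that the paper uses them. The function names `f_beta` and `iou`, the box layout `(x_min, y_min, x_max, y_max)`, and the sample numbers are illustrative choices, not taken from the paper.

```python
from typing import Tuple

# Assumed box layout for this sketch: (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def f_beta(precision: float, recall: float, beta: float = 2.0) -> float:
    """F-beta score. With beta=2 (the F2 score), recall is weighted
    more heavily than precision."""
    denom = beta ** 2 * precision + recall
    if denom == 0:
        return 0.0  # both precision and recall are zero
    return (1 + beta ** 2) * precision * recall / denom


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned rectangles."""
    # Width and height of the overlap region, clamped at zero.
    overlap_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    overlap_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = overlap_w * overlap_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical values, chosen only to make the arithmetic easy to follow.
    print(f"{f_beta(0.5, 1.0):.4f}")                 # 0.8333
    print(f"{iou((0, 0, 2, 2), (1, 1, 3, 3)):.4f}")  # 0.1429
```

Because F2 favors recall, a model that misses few distressed images can score well even with moderate precision, which is consistent with ResNet 34 leading on both Recall and F2 in the binary task. Raising the IoU threshold demands tighter overlap between predicted and labeled boxes, which is one reason mAP typically falls as the threshold increases, as the reported figures do.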


Source journal

Journal of Traffic and Transportation Engineering-English Edition (TRANSPORTATION SCIENCE & TECHNOLOGY)

CiteScore: 13.60
Self-citation rate: 6.30%
Articles published: 402
Review time: 15 weeks

About the journal: The Journal of Traffic and Transportation Engineering (English Edition) serves as a renowned academic platform facilitating the exchange and exploration of innovative ideas in the realm of transportation. Our journal aims to foster theoretical and experimental research in transportation and welcomes the submission of exceptional peer-reviewed papers on engineering, planning, management, and information technology. We are dedicated to expediting the peer review process and ensuring timely publication of top-notch research in this field.