Deru Li, Zhongdong Duan, Xiaoyang Hu, Dongchang Zhang, Yiying Zhang
DOI: 10.1016/j.jtte.2021.04.008
Journal: Journal of Traffic and Transportation Engineering (English Edition)
Publication date: 2023-04-01
Publication type: Journal Article
URL: https://www.sciencedirect.com/science/article/pii/S2095756423000272
Automated classification and detection of multiple pavement distress images based on deep learning
To achieve automatic, fast, efficient, and high-precision pavement distress classification and detection, deep-learning models for classifying and detecting road surface distress images are trained. First, a pavement distress image dataset is built, comprising 9017 images with distress and 9620 images without distress. The images were captured on 4 asphalt highways in 3 provinces of China. Each distress image contains one or more types of distress: alligator crack, longitudinal crack, block crack, transverse crack, pothole, and patch. Each distress is labeled with a rectangular bounding box. ResNet and VGG networks are then used both as binary classification models that separate distressed from non-distressed images and as multi-label classification models for the six distress types. Training techniques such as data augmentation, batch normalization, dropout, momentum, weight decay, transfer learning, and discriminative learning rates are used. Among the 4 CNNs considered in this study (ResNet 34, ResNet 50, VGG 16, and VGG 19), for binary classification ResNet 50 has the highest Accuracy (96.243%) and Precision (95.183%), while ResNet 34 has the highest Recall (97.824%) and F2 score (97.052%). For multi-label classification, ResNet 50 performs best, with the highest Accuracy (90.257%, above the 90% required by the Chinese standard JTG H20-2018 for road distress detection), F2 score (82.231%), and Precision (76.509%), while ResNet 34 has the highest Recall (87.32%). To locate and quantify the distress areas in the images, a single shot multibox detector (SSD) model is developed, with ResNet 50 as the base network for feature extraction. With the intersection over union (IoU) threshold set to 0, 0.25, 0.50, and 0.75, the mean average precision (mAP) of the model is 74.881%, 50.511%, 28.432%, and 3.969%, respectively.
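The abstract reports F2 scores and mAP at several IoU settings without defining either quantity. The following is a minimal sketch based on the standard textbook definitions, not code from the paper. The function names `f_beta` and `iou`, the box format `(x_min, y_min, x_max, y_max)`, and the sample values are illustrative assumptions.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x_min, y_min, x_max, y_max)


def f_beta(precision: float, recall: float, beta: float = 2.0) -> float:
    """F-beta score: weighted harmonic mean of precision and recall.

    beta = 2 gives the F2 score, which weights recall more than precision.
    """
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when both inputs are zero
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


def iou(box_a: Box, box_b: Box) -> float:
    """Intersection over union of two axis-aligned rectangles."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap; clamp to zero when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0.0 else 0.0


if __name__ == "__main__":
    # Illustrative values only; these are not results from the paper.
    print(f_beta(precision=0.5, recall=1.0))
    print(iou((0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0)))
```

In a typical detection evaluation, a predicted box counts as a match for a labeled box only when their IoU reaches the chosen threshold, so raising the threshold makes matches rarer. This is consistent with the falling mAP values reported above.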
About the journal:
The Journal of Traffic and Transportation Engineering (English Edition) serves as a renowned academic platform facilitating the exchange and exploration of innovative ideas in the realm of transportation. Our journal aims to foster theoretical and experimental research in transportation and welcomes the submission of exceptional peer-reviewed papers on engineering, planning, management, and information technology. We are dedicated to expediting the peer review process and ensuring timely publication of top-notch research in this field.