Road-pavement classification by artificial neural network model based on tire-pavement noise and road-surface image

IF 3.4, Zone 2 (Physics and Astrophysics), Q1 ACOUSTICS

Applied Acoustics, Pub Date: 2024-08-01, DOI: 10.1016/j.apacoust.2024.110194

URL: https://www.sciencedirect.com/science/article/pii/S0003682X24003451

Citations: 0

Abstract

This study focuses on an artificial neural network (ANN) model that classifies pavement types from acoustic and image data. Conventional studies often classify pavement from road-surface images, but image quality degrades with external factors such as sunlight angle, shadows, and lighting. This study therefore uses tire-pavement noise, whose characteristics depend on the pavement material and surface treatment, both on its own and together with image data for ANN training. To construct the training dataset, tire-pavement noise and road-surface images are collected at 11 highway sampling sites in South Korea. The two measurements are made simultaneously: the tire-pavement noise is recorded using the on-board sound intensity (OBSI) method, and a camera captures the road-surface images. Five features are extracted from the raw data and used to train the ANN models: the 1/3-octave-band sound intensity level (SIL), the spectrum, and mel-frequency cepstral coefficients (MFCC) from the noise, and the gray-level co-occurrence matrix (GLCM) and histogram of oriented gradients (HOG) from the images. Using the spectrum as the input feature yields a classification accuracy of 95.18%, but the total number of parameters in that ANN is double that of the other models. Training on the 1/3-octave-band SIL instead halves the model size, but the accuracy decreases by 13.47 percentage points. To overcome this significant decrease, the ANN is trained on the 1/3-octave-band SIL and the image features together, which raises the accuracy to 93.85%. Training the ANN on MFCC, which is commonly used as an acoustic feature in other machine-learning studies, achieves the highest classification accuracy, 96.84%. The MFCC models are also affected by the number of coefficients and the signal length. Thirteen coefficients are generally considered suitable for speech recognition; to include the dominant frequency of tire-pavement noise, more than 13 are used here, and increasing the number from 13 to 40 improves accuracy by 1.17 percentage points. The interval for slicing the raw WAV files is also reduced, both to increase the amount of training data and to classify the pavement from shorter signals without a statistically significant loss of accuracy. Accuracy does not decrease as the signal length is shortened to 0.5 seconds, but it drops rapidly once the signal length falls below 0.4 seconds.
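The effect of the slicing interval on the amount of training data can be illustrated with a minimal sketch. It assumes non-overlapping segments and drops any trailing partial segment; the abstract does not state how the authors sliced their recordings, so this is not their procedure. The function name `slice_signal`, the 1 kHz sample rate, and the 10-second placeholder recording are all hypothetical.

```python
def slice_signal(samples, sample_rate, segment_seconds):
    """Split a 1-D sample sequence into non-overlapping fixed-length segments.

    Trailing samples that do not fill a whole segment are dropped.
    """
    if sample_rate <= 0 or segment_seconds <= 0:
        raise ValueError("sample_rate and segment_seconds must be positive")
    # Samples per segment, rounded to the nearest whole sample.
    seg_len = round(sample_rate * segment_seconds)
    if seg_len == 0:
        raise ValueError("segment is shorter than one sample")
    n_segments = len(samples) // seg_len
    return [samples[i * seg_len:(i + 1) * seg_len] for i in range(n_segments)]


if __name__ == "__main__":
    sample_rate = 1000                      # Hz, hypothetical value
    recording = [0.0] * (10 * sample_rate)  # 10 s placeholder, not real data
    for seconds in (2.0, 1.0, 0.5):
        segments = slice_signal(recording, sample_rate, seconds)
        print(f"{seconds:.1f} s segments: {len(segments)}")
```

Halving the segment length doubles the number of training examples obtained from the same recording, which is the trade-off the abstract describes: more examples, each carrying less signal for the classifier to work with.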




Source journal: Applied Acoustics (Physics, Acoustics)

CiteScore: 7.40

Self-citation rate: 11.80%

Articles published: 618

Review time: 7.5 months

About the journal: Since its launch in 1968, Applied Acoustics has been publishing high-quality research papers providing state-of-the-art coverage of research findings for engineers and scientists involved in applications of acoustics in the widest sense. Applied Acoustics looks not only at recent developments in the understanding of acoustics but also at ways of exploiting that understanding. The journal aims to encourage the exchange of practical experience through publication and in so doing creates a fund of technological information that can be used for solving related problems. The presentation of information in graphical or tabular form is especially encouraged. If a report of a mathematical development is a necessary part of a paper, it is important to ensure that it is there only as an integral part of a practical solution to a problem and is supported by data. Applied Acoustics publishes complete papers, short technical notes, and review articles. Manuscripts that address all fields of applications of acoustics, ranging from medicine and NDT to the environment and buildings, are welcome.