YOLOv8-MAH: Multi-attribute recognition model for Vehicles
Authors: Yazhou Zhao, Hongdong Zhao, Jianfeng Shi
Journal: Pattern Recognition, vol. 167, Article 111849 (JCR Q1, Computer Science, Artificial Intelligence; impact factor 7.5)
Published: 2025-05-13
DOI: 10.1016/j.patcog.2025.111849
URL: https://www.sciencedirect.com/science/article/pii/S0031320325005096
Citations: 0
Abstract
Vehicle multi-attribute recognition is increasingly used in intelligent traffic management, but intra-class variability and inter-class similarity among vehicles make the task difficult. To address this challenge, this paper proposes an improved model, YOLOv8-MAH (YOLOv8 Multi-Attribute-Head), which aims to enhance multi-attribute recognition performance. To exploit the transformer encoder's ability to capture detailed information, the C2f (CSP Bottleneck 2 Convolution) module in the backbone network is replaced by the global channel module of MobileViT. In addition, a C2f-E module based on the EMA (Efficient Multi-scale Attention) architecture is designed to improve the network's ability to recognize different attributes, and an additional detection layer is added to better extract information from fine-detail regions of the image so that more attributes can be identified. The self-made dataset is labeled from three perspectives (vehicle brand, color, and direction) and is divided into 144 categories. Experimental results show that YOLOv8-MAH achieves good performance on the vehicle multi-attribute recognition task.
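The abstract does not say how the 144 categories are composed from the three attributes. As a rough illustration only, the sketch below shows one way a composite label space of that size could be built by enumerating (brand, color, direction) triples. The vocabulary sizes (12 brands, 6 colors, 2 directions), the attribute names, and the helpers `encode` and `decode` are all hypothetical and are not taken from the paper.

```python
from itertools import product
from typing import Tuple

# Hypothetical attribute vocabularies: the abstract gives only the total of
# 144 categories, so these sizes (12 x 6 x 2) are assumed for illustration.
BRANDS = [f"brand_{i}" for i in range(12)]
COLORS = ["black", "white", "gray", "red", "blue", "yellow"]
DIRECTIONS = ["front", "rear"]

# Enumerate every (brand, color, direction) triple as one composite class.
# itertools.product varies the last attribute fastest.
CLASSES = list(product(BRANDS, COLORS, DIRECTIONS))
CLASS_TO_ID = {triple: idx for idx, triple in enumerate(CLASSES)}


def encode(brand: str, color: str, direction: str) -> int:
    """Map an attribute triple to a single composite class id."""
    return CLASS_TO_ID[(brand, color, direction)]


def decode(class_id: int) -> Tuple[str, str, str]:
    """Recover the attribute triple from a composite class id."""
    return CLASSES[class_id]
```

Under these assumptions, a detector that predicts one composite class id per box reports all three attributes at once, and `decode` splits the prediction back into brand, color, and direction.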
About the journal:
The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.