Nha Tran, Toan Nguyen, Minh Nguyen, Khiet Luong, Tai Lam
{"title":"Global-local attention with triplet loss and label smoothed crossentropy for person re-identification","authors":"Nha Tran, Toan Nguyen, Minh Nguyen, Khiet Luong, Tai Lam","doi":"10.11591/ijai.v12.i4.pp1883-1891","DOIUrl":null,"url":null,"abstract":"Person re-identification (Person Re-ID) is a research direction on tracking and identifying people in surveillance camera systems with non-overlapping camera perspectives. Despite much research on this topic, there are still some practical problems that Person Re-ID has not yet solved, in reality, human objects can easily be obscured by obstructions such as other people, trees, luggage, umbrellas, signs, cars, motorbikes. In this paper, we propose a multibranch deep learning network architecture. In which one branch is for the representation of global features and two branches are for the representation of local features. Dividing the input image into small parts and changing the number of parts between the two branches helps the model to represent the features better. In addition, we add an attention module to the ResNet50 backbone that enhances important human characteristics and eliminates irrelevant information. To improve robustness, the model is trained by combining triplet loss and label smoothing cross-entropy loss (LSCE). Experiments are carried out on datasets Market1501, and duke multi-target multi-camera (DukeMTMC) datasets, our method achieved 96.04% rank-1, 88,11% mean average precision (mAP) on the Market1501 dataset, and 88.78% rank-1, 78,6% mAP on the DukeMTMC dataset. This method achieves performance better than some state-of-the-art methods.","PeriodicalId":52221,"journal":{"name":"IAES International Journal of Artificial Intelligence","volume":"1 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2023-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IAES International Journal of Artificial Intelligence","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.11591/ijai.v12.i4.pp1883-1891","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"Decision Sciences","Score":null,"Total":0}
引用次数: 0
Abstract
Person re-identification (Person Re-ID) is a research direction on tracking and identifying people in surveillance camera systems with non-overlapping camera perspectives. Despite much research on this topic, there are still some practical problems that Person Re-ID has not yet solved, in reality, human objects can easily be obscured by obstructions such as other people, trees, luggage, umbrellas, signs, cars, motorbikes. In this paper, we propose a multibranch deep learning network architecture. In which one branch is for the representation of global features and two branches are for the representation of local features. Dividing the input image into small parts and changing the number of parts between the two branches helps the model to represent the features better. In addition, we add an attention module to the ResNet50 backbone that enhances important human characteristics and eliminates irrelevant information. To improve robustness, the model is trained by combining triplet loss and label smoothing cross-entropy loss (LSCE). Experiments are carried out on datasets Market1501, and duke multi-target multi-camera (DukeMTMC) datasets, our method achieved 96.04% rank-1, 88,11% mean average precision (mAP) on the Market1501 dataset, and 88.78% rank-1, 78,6% mAP on the DukeMTMC dataset. This method achieves performance better than some state-of-the-art methods.