Liviu-Daniel Stefan, Ionut Mironica, C. Mitrea, B. Ionescu
Title: End to end very deep person re-identification
Venue: 2017 International Symposium on Signals, Circuits and Systems (ISSCS)
DOI: 10.1109/ISSCS.2017.8034923
Published: 2017-07-01
Citation count: 0
Abstract
Convolutional Neural Networks (CNNs) are responsible for major breakthroughs in object recognition in still images. This work presents an end-to-end, very deep architecture with small convolutional kernels and small convolutional strides for person re-identification in video streams. To build such a system, several good training practices were tested, namely: (i) training from scratch, (ii) pre-training the last layer, (iii) small learning rates, (iv) data augmentation techniques, and (v) a high dropout ratio. The key contribution of this paper is a trainable, end-to-end deep network that enables effective real-time re-identification of people in multi-stream video from various sources (indoor and outdoor). Experimental evaluation on a real-world, publicly available dataset shows the benefits of this approach.
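The design choice of small kernels and small strides in a very deep network can be illustrated with a little arithmetic. The abstract does not give the actual kernel sizes, strides, or channel widths, so the sketch below assumes 3x3 kernels with stride 1 and 64 channels purely for illustration; the helper names `receptive_field` and `conv_weights` are hypothetical and not taken from the paper. It shows that a stack of small convolutions covers the same input region as one large kernel while using fewer weights.

```python
def receptive_field(layers):
    """Receptive field (in input pixels) of a stack of conv layers.

    layers: list of (kernel_size, stride) tuples, ordered input to output.
    """
    rf, jump = 1, 1  # jump = spacing between adjacent outputs, in input pixels
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf


def conv_weights(kernel, channels, depth=1):
    """Weight count (biases ignored) of `depth` square convolutions,
    each with `channels` input and `channels` output feature maps."""
    return depth * kernel * kernel * channels * channels


# Assumed configuration: three stacked 3x3/stride-1 layers vs. one 7x7 layer.
small_stack = [(3, 1)] * 3
single_large = [(7, 1)]

print(receptive_field(small_stack), receptive_field(single_large))  # 7 7
print(conv_weights(3, 64, depth=3), conv_weights(7, 64))  # 110592 200704
```

Under these assumptions, the three-layer stack sees the same 7x7 input window as the single large kernel with roughly 55% of the weights, and it applies a nonlinearity after each layer. This is a commonly cited motivation for small-kernel deep networks; whether it matches the authors' exact reasoning is not stated in the abstract.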