Person Re-identification Method Using Text Description Through CLIP
Kunho Kim, Min-jae Kim, Hyungtae Kim, Seokmok Park, J. Paik
2023 International Conference on Electronics, Information, and Communication (ICEIC), published 2023-02-05
DOI: 10.1109/ICEIC57457.2023.10049924 (https://doi.org/10.1109/ICEIC57457.2023.10049924)
Citation count: 0
Abstract
A typical person re-identification (Re-ID) system takes a person image as a query and finds the most similar person among the images in a gallery. As a result, the system's performance depends heavily on the quality of the query image. We see a hint for overcoming this limitation in the recent, surprising progress in multi-modal learning between vision and language. In this context, this paper proposes a person re-identification method that uses text guidance via Contrastive Language-Image Pre-training (CLIP). To make full use of CLIP, we show how to transfer its knowledge to a person Re-ID network. Experimental results demonstrate the superior performance of our method on person Re-ID.
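The query-to-gallery retrieval step described in the abstract can be sketched as a nearest-neighbour ranking over feature embeddings. The sketch below is only an illustration under stated assumptions: the abstract does not specify the similarity measure or the network, so cosine similarity, the function names `cosine_similarity` and `rank_gallery`, and the toy three-dimensional embeddings are all hypothetical. In a text-guided setting such as the one the paper proposes, the query embedding could plausibly come from a text encoder that shares an embedding space with the image encoder, as CLIP provides, but that pipeline is not shown here.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for an all-zero embedding
    return dot / (norm_a * norm_b)


def rank_gallery(query: Sequence[float],
                 gallery: List[Sequence[float]]) -> List[int]:
    """Return gallery indices ordered from most to least similar to the query."""
    scores = [cosine_similarity(query, g) for g in gallery]
    return sorted(range(len(gallery)), key=lambda i: scores[i], reverse=True)


# Toy embeddings, made up for illustration (not taken from the paper).
query_embedding = [1.0, 0.0, 0.0]
gallery_embeddings = [
    [0.0, 1.0, 0.0],   # orthogonal to the query
    [0.9, 0.1, 0.0],   # nearly aligned with the query
    [-1.0, 0.0, 0.0],  # opposite direction
]

ranking = rank_gallery(query_embedding, gallery_embeddings)
print(ranking)
```

Because the ranking depends only on the embeddings, a poor-quality query image degrades every score at once, which is the limitation the abstract points to as motivation for text guidance.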