Ultra Fine-Grained Image Semantic Embedding
Da-Cheng Juan, Chun-Ta Lu, Zhuguo Li, Futang Peng, Aleksei Timofeev, Yi-Ting Chen, Y. Gao, Tom Duerig, A. Tomkins, Sujith Ravi
Proceedings of the 13th International Conference on Web Search and Data Mining (WSDM 2020). DOI: 10.1145/3336191.3371784
Abstract
"How to learn image embeddings that capture fine-grained semantics based on the instance of an image?" "Is it possible for such embeddings to further understand image semantics closer to humans' perception?" In this paper, we present, Graph-Regularized Image Semantic Embedding (Graph-RISE), a web-scale neural graph learning framework deployed at Google, which allows us to train image embeddings to discriminate an unprecedented O(40M) ultra-fine-grained semantic labels. The proposed Graph-RISE outperforms state-of-the-art image embedding algorithms on several evaluation tasks, including kNN search and triplet ranking: the accuracy is improved by approximately 2X on the ImageNet dataset and by more than 5X on the iNaturalist dataset. Qualitatively, image retrieval from one billion images based on the proposed Graph-RISE effectively captures semantics and, compared to the state-of-the-art, differentiates nuances at levels that are closer to human-perception.