Xin Yang, Qiong Liu, Chunyuan Liao, K. Cheng, Andreas Girgensohn
{"title":"Large-scale EMM identification based on geometry-constrained visual word correspondence voting","authors":"Xin Yang, Qiong Liu, Chunyuan Liao, K. Cheng, Andreas Girgensohn","doi":"10.1145/1991996.1992031","DOIUrl":null,"url":null,"abstract":"We present a large-scale Embedded Media Marker (EMM) identification system which allows users to retrieve relevant dynamic media associated with a static paper document via camera-phones. The user supplies a query image by capturing an EMM-signified patch of a paper document through a camera phone. The system recognizes the query and in turn retrieves and plays the corresponding media on the phone. Accurate image matching is crucial for positive user experience in this application. To address the challenges posed by large datasets and variation in camera-phone-captured query images, we introduce a novel image matching scheme based on geometrically consistent correspondences. A hierarchical scheme, combined with two constraining methods, is designed to detect geometric constrained correspondences between images. A spatial neighborhood search approach is further proposed to address challenging cases of query images with a large translational shift. Experimental results on a 200k+ dataset show that our solution achieves high accuracy with low memory and time complexity and outperforms the baseline bag-of-words approach.","PeriodicalId":390933,"journal":{"name":"Proceedings of the 1st ACM International Conference on Multimedia Retrieval","volume":"3 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2011-04-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"12","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 1st ACM International Conference on Multimedia Retrieval","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/1991996.1992031","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 12
Abstract
We present a large-scale Embedded Media Marker (EMM) identification system which allows users to retrieve relevant dynamic media associated with a static paper document via camera-phones. The user supplies a query image by capturing an EMM-signified patch of a paper document through a camera phone. The system recognizes the query and in turn retrieves and plays the corresponding media on the phone. Accurate image matching is crucial for positive user experience in this application. To address the challenges posed by large datasets and variation in camera-phone-captured query images, we introduce a novel image matching scheme based on geometrically consistent correspondences. A hierarchical scheme, combined with two constraining methods, is designed to detect geometric constrained correspondences between images. A spatial neighborhood search approach is further proposed to address challenging cases of query images with a large translational shift. Experimental results on a 200k+ dataset show that our solution achieves high accuracy with low memory and time complexity and outperforms the baseline bag-of-words approach.