Dense sampling and fast encoding for 3D model retrieval using bag-of-visual features

Proceeding of the ACM International Conference on Image and Video Retrieval - CIVR '09 Pub Date : 2009-07-08 DOI:10.1145/1646396.1646430

T. Furuya, Ryutarou Ohbuchi


Citations: 142

Abstract

Our previous shape-based 3D model retrieval algorithm compares 3D shapes by using thousands of local visual features per model. A 3D model is rendered into a set of depth images, and local visual features are extracted from each image with Lowe's Scale Invariant Feature Transform (SIFT) algorithm. To compare large sets of local features efficiently, the algorithm employs a bag-of-features approach that integrates the local features into a single feature vector per model. The algorithm outperformed other methods on a dataset containing highly articulated yet geometrically simple 3D models. On a dataset containing diverse and detailed models, it performed only as well as other methods. This paper proposes an improved algorithm that performs as well as or better than our previous method for both articulated models and rigid but geometrically detailed models. The proposed algorithm extracts a much larger number of local visual features by sampling each depth image densely and randomly. To contain computational cost, the method uses a GPU for SIFT feature extraction and an efficient randomized decision tree for encoding SIFT features into visual words. Empirical evaluation showed that the proposed method is very fast, yet significantly outperforms our previous method for rigid and geometrically detailed models. For the simple yet articulated models, the performance was virtually unchanged.
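The pipeline described in the abstract (sample many positions per depth image, describe each one, map every descriptor to a visual word, and accumulate one histogram per model) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: the function names are hypothetical, the descriptors are toy 2D tuples and not SIFT features, the codebook is hand-written, and a brute-force nearest-neighbor search stands in for the randomized decision tree that the paper uses to speed up encoding.

```python
import random
from collections import Counter


def sample_keypoints(width, height, n_points, rng):
    """Pick n_points uniformly random pixel positions in a width x height image.

    Hypothetical stand-in for dense random sampling of a depth image.
    """
    return [(rng.randrange(width), rng.randrange(height)) for _ in range(n_points)]


def nearest_word(descriptor, codebook):
    """Return the index of the codebook entry closest to the descriptor.

    Brute-force squared Euclidean search; the paper instead uses a
    randomized decision tree for this step.
    """
    def dist2(word):
        return sum((a - b) ** 2 for a, b in zip(descriptor, word))

    return min(range(len(codebook)), key=lambda i: dist2(codebook[i]))


def bag_of_features(descriptors, codebook):
    """Encode a set of descriptors as a normalized visual-word histogram."""
    if not descriptors:
        return [0.0] * len(codebook)
    counts = Counter(nearest_word(d, codebook) for d in descriptors)
    total = len(descriptors)
    return [counts[i] / total for i in range(len(codebook))]


# Toy data: three visual words and four descriptors (illustrative only).
codebook = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
descriptors = [(0.1, 0.0), (0.9, 1.1), (0.2, 0.1), (0.0, 0.8)]

hist = bag_of_features(descriptors, codebook)
print(hist)  # [0.5, 0.25, 0.25]

# Random sample positions for a hypothetical 64 x 48 depth image.
points = sample_keypoints(64, 48, 100, random.Random(0))
```

Two models can then be compared by measuring the distance between their histograms, so the cost of a comparison no longer depends on how many local features were extracted from each model.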

