View sequence prediction GAN: unsupervised representation learning for 3D shapes by decomposing view content and viewpoint variance

Heyu Zhou, Jiayu Li, Xianzhu Liu, Yingda Lyu, Haipeng Chen, An-An Liu
DOI: 10.1007/s00530-024-01431-8 · Journal Article · Published 2024-08-01 · IF 4.3 · JCR Q1 (Engineering, Electrical & Electronic) · Citations: 0

Abstract

Unsupervised representation learning for 3D shapes has become a critical problem for large-scale 3D shape management. Recent model-based methods for this task require additional information for training, while popular view-based methods often overlook viewpoint variance in view prediction, leading to uninformative 3D features that limit their practical applications. To address these issues, we propose an unsupervised 3D shape representation learning method called View Sequence Prediction GAN (VSP-GAN), which decomposes view content and viewpoint variance. VSP-GAN takes several adjacent views of a 3D shape as input and outputs the subsequent views. The key idea is to split the multi-view sequence into two perceptible parts, view content and viewpoint variance, and to encode them independently with separate encoders. Using this information, we design a decoder that mirrors the architecture of the content encoder and predicts the view sequence in multiple steps. In addition, to improve the quality of the reconstructed views, we propose a novel hierarchical view prediction loss that enhances view realism, semantic consistency, and detail retention. We evaluate the proposed VSP-GAN on two popular 3D CAD datasets, ModelNet10 and ModelNet40, for 3D shape classification and retrieval. The experimental results demonstrate that our VSP-GAN learns more discriminative features than state-of-the-art methods.
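The input/output protocol described above (several adjacent views in, the subsequent views out, predicted step by step) can be sketched schematically. This is purely an illustrative assumption, not the paper's model: `split_view_sequence`, `predict_sequence`, and the averaging `step_fn` below are hypothetical placeholders standing in for the encoder/decoder pair.

```python
import numpy as np

def split_view_sequence(views, n_input):
    """Split a circularly rendered multi-view sequence into the observed
    input views and the subsequent views the generator must predict.
    `views`: array of shape (V, H, W), V views rendered around the shape."""
    inputs = views[:n_input]
    targets = views[n_input:]
    return inputs, targets

def predict_sequence(inputs, n_steps, step_fn):
    """Multi-step prediction: each new view is generated from the growing
    context, mimicking an autoregressive decoder rollout."""
    context = list(inputs)
    predicted = []
    for _ in range(n_steps):
        nxt = step_fn(np.stack(context))  # placeholder for encode/decode
        predicted.append(nxt)
        context.append(nxt)               # feed the prediction back in
    return np.stack(predicted)

# Toy data: 12 views of 32x32 pixels; the step function here just averages
# the context views (a placeholder, not the actual VSP-GAN decoder).
views = np.random.rand(12, 32, 32)
obs, tgt = split_view_sequence(views, n_input=4)
pred = predict_sequence(obs, n_steps=len(tgt), step_fn=lambda c: c.mean(axis=0))
```

The rollout reuses each predicted view as context for the next step, which is one natural way to realize "predicts the view sequence in multiple steps".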

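The abstract names three objectives for the hierarchical view prediction loss: realism, semantic consistency, and detail retention. One plausible composition, assumed here for illustration and not taken from the paper, is a weighted sum of a pixel-level L1 term, a feature-space distance, and a non-saturating adversarial term; `feat_fn`, `disc_fn`, and the weights are hypothetical stand-ins.

```python
import numpy as np

def hierarchical_view_loss(pred, target, feat_fn, disc_fn,
                           w_pix=1.0, w_sem=1.0, w_adv=0.1):
    # Pixel-level L1 distance: encourages detail retention.
    l_pix = np.abs(pred - target).mean()
    # Feature-level L2 distance: encourages semantic consistency.
    l_sem = ((feat_fn(pred) - feat_fn(target)) ** 2).mean()
    # Non-saturating adversarial term: encourages realism.
    l_adv = -np.log(disc_fn(pred) + 1e-8)
    return w_pix * l_pix + w_sem * l_sem + w_adv * l_adv

# Placeholder networks: a pooled "feature extractor" and a sigmoid "discriminator".
feat_fn = lambda v: v.reshape(v.shape[0], -1).mean(axis=1)
disc_fn = lambda v: 1.0 / (1.0 + np.exp(-v.mean()))

rng = np.random.default_rng(0)
pred = rng.random((8, 32, 32))
target = rng.random((8, 32, 32))
loss = hierarchical_view_loss(pred, target, feat_fn, disc_fn)
```

Combining terms at pixel, feature, and discriminator level is the standard way such "hierarchical" losses are built in view-synthesis GANs; the exact terms and weights in VSP-GAN may differ.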
