Image Classification Performance Evaluation for 3D Model Reconstruction

A. Yuniarti, N. Suciati, A. Arifin
{"title":"三维模型重建的图像分类性能评价","authors":"A. Yuniarti, N. Suciati, A. Arifin","doi":"10.1109/ICRAMET51080.2020.9298643","DOIUrl":null,"url":null,"abstract":"3D reconstruction of 2D images is a classical problem in computer vision. Conventional methods have been proposed using multiple image registration, intrinsic and extrinsic camera parameter estimation, and optimization methods. Recently, the availability of a 3D dataset publicly shared has encouraged a deep-learning-based method for single-view reconstruction. One approach was by employing direct image encoding to the 3D point decoding approach as in PointNet and AtlasNet. Some other research directions attempted to retrieve 3D data using deep-learning-based methods, which employed a convolutional neural network (CNN). However, the use of CNN in image classification specific for the 3D reconstruction task still needs to be investigated because, usually, CNN was used as an image encoder in an auto-encoder setting instead of a classification module in a point generation network. Moreover, there is a lack of reports on the performance evaluation of deep-learning-based method on images rendered from 3D data, such as ShapeNet rendering images. In this paper, we implemented several deep-learning models to decode the ShapeNet rendering images that contain 13 model categories to examine the various hyper-parameters' impacts on each 3D model category. Our experiments showed that the hyper-parameters of the learning rate and epochs set to either 0.001 or 0.0001 and 60-80 epochs significantly outperformed other settings. Moreover, we observed that regardless of network configuration, some categories (plane, watercraft, car) performed better throughout the study. Therefore, a 3D reconstruction based on image classification can be designed based on the best performing categories.","PeriodicalId":228482,"journal":{"name":"2020 International Conference on Radar, Antenna, Microwave, Electronics, and Telecommunications (ICRAMET)","volume":"123 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Image Classification Performance Evaluation for 3D Model Reconstruction\",\"authors\":\"A. Yuniarti, N. Suciati, A. Arifin\",\"doi\":\"10.1109/ICRAMET51080.2020.9298643\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"3D reconstruction of 2D images is a classical problem in computer vision. Conventional methods have been proposed using multiple image registration, intrinsic and extrinsic camera parameter estimation, and optimization methods. Recently, the availability of a 3D dataset publicly shared has encouraged a deep-learning-based method for single-view reconstruction. One approach was by employing direct image encoding to the 3D point decoding approach as in PointNet and AtlasNet. Some other research directions attempted to retrieve 3D data using deep-learning-based methods, which employed a convolutional neural network (CNN). However, the use of CNN in image classification specific for the 3D reconstruction task still needs to be investigated because, usually, CNN was used as an image encoder in an auto-encoder setting instead of a classification module in a point generation network. Moreover, there is a lack of reports on the performance evaluation of deep-learning-based method on images rendered from 3D data, such as ShapeNet rendering images. 
In this paper, we implemented several deep-learning models to decode the ShapeNet rendering images that contain 13 model categories to examine the various hyper-parameters' impacts on each 3D model category. Our experiments showed that the hyper-parameters of the learning rate and epochs set to either 0.001 or 0.0001 and 60-80 epochs significantly outperformed other settings. Moreover, we observed that regardless of network configuration, some categories (plane, watercraft, car) performed better throughout the study. Therefore, a 3D reconstruction based on image classification can be designed based on the best performing categories.\",\"PeriodicalId\":228482,\"journal\":{\"name\":\"2020 International Conference on Radar, Antenna, Microwave, Electronics, and Telecommunications (ICRAMET)\",\"volume\":\"123 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-11-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2020 International Conference on Radar, Antenna, Microwave, Electronics, and Telecommunications (ICRAMET)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICRAMET51080.2020.9298643\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 International Conference on Radar, Antenna, Microwave, Electronics, and Telecommunications (ICRAMET)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICRAMET51080.2020.9298643","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0

Abstract

3D reconstruction from 2D images is a classical problem in computer vision. Conventional methods rely on multi-image registration, intrinsic and extrinsic camera parameter estimation, and optimization. Recently, publicly shared 3D datasets have encouraged deep-learning-based methods for single-view reconstruction. One approach encodes the input image directly and decodes 3D points, as in PointNet and AtlasNet. Other research directions attempted to retrieve 3D data with deep-learning-based methods built on convolutional neural networks (CNNs). However, the use of a CNN as an image classifier specifically for the 3D reconstruction task still needs to be investigated, because a CNN is usually employed as the image encoder in an auto-encoder setting rather than as a classification module in a point-generation network. Moreover, there is a lack of reports on the performance of deep-learning-based methods on images rendered from 3D data, such as the ShapeNet rendering images. In this paper, we implemented several deep-learning models to classify the ShapeNet rendering images, which span 13 model categories, and examined the impact of various hyper-parameters on each 3D model category. Our experiments showed that a learning rate of 0.001 or 0.0001 combined with 60-80 training epochs significantly outperformed other settings. Moreover, we observed that, regardless of network configuration, some categories (plane, watercraft, car) performed better throughout the study. Therefore, a 3D reconstruction pipeline based on image classification can be designed around the best-performing categories.
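The paper does not publish code, but the experimental setup described in the abstract can be sketched as a standard CNN classifier trained on the 13-category ShapeNet renderings. The snippet below is a minimal, hypothetical sketch: the small convolutional architecture, the directory layout, and the 137x137 image size are assumptions; only the learning rate (0.001 or 0.0001) and the 60-80 epoch range are taken from the abstract.

```python
# Hypothetical sketch of the evaluation setup; architecture, paths, and image
# size are assumptions, not the authors' published configuration.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 13           # ShapeNet rendering categories used in the paper
IMG_SIZE = (137, 137)      # assumed size of the ShapeNet renderings
LEARNING_RATE = 1e-3       # abstract reports 1e-3 or 1e-4 performing best
EPOCHS = 70                # abstract reports 60-80 epochs performing best

# Assumed layout: shapenet_renders/{train,val}/<category>/<image>.png
train_ds = tf.keras.utils.image_dataset_from_directory(
    "shapenet_renders/train", image_size=IMG_SIZE, batch_size=64)
val_ds = tf.keras.utils.image_dataset_from_directory(
    "shapenet_renders/val", image_size=IMG_SIZE, batch_size=64)

# A small CNN classifier (assumed architecture).
model = models.Sequential([
    layers.Rescaling(1.0 / 255, input_shape=IMG_SIZE + (3,)),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(128, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=LEARNING_RATE),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
model.fit(train_ds, validation_data=val_ds, epochs=EPOCHS)

# Per-category accuracy, to compare categories such as plane, watercraft, car.
y_true, y_pred = [], []
for images, labels in val_ds:
    y_true.append(labels.numpy())
    y_pred.append(np.argmax(model.predict(images, verbose=0), axis=1))
y_true, y_pred = np.concatenate(y_true), np.concatenate(y_pred)
for c, name in enumerate(val_ds.class_names):
    mask = y_true == c
    print(f"{name}: {(y_pred[mask] == y_true[mask]).mean():.3f}")
```

Sweeping LEARNING_RATE and EPOCHS over a grid and repeating the per-category report would reproduce the kind of hyper-parameter comparison the abstract describes.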