从多视图图像中重建三维形状作为盒子的联合

Proceedings of the 2023 5th International Conference on Pattern Recognition and Intelligent Systems Pub Date : 2023-07-28 DOI:10.1145/3609703.3609705

Zihan Yang, Minglun Gong

{"title":"从多视图图像中重建三维形状作为盒子的联合","authors":"Zihan Yang, Minglun Gong","doi":"10.1145/3609703.3609705","DOIUrl":null,"url":null,"abstract":"The task of reconstructing object shapes from input images has become increasingly important in various fields, such as computer vision, robotics, augmented reality, video games, and autonomous vehicles. While approaches for reconstructing shapes with varying levels of detail have been proposed, balancing representation accuracy and model complexity remains a challenge. To address this challenge, we propose an end-to-end approach for reconstructing object shapes from multiple images using a union of box primitives. Our approach offers a simpler and more efficient 3D representation of objects without the need for intermediate products such as voxels, resulting in faster inference times. Additionally, we introduce an auxiliary task to aid in learning how to extract and transform spatial features from images without requiring camera calibrations. Extensive experiments demonstrate that our method can produce comparable results to approaches that require 3D voxelized input while utilizing only 2D RGB images as input. Furthermore, our method significantly outperforms the aforementioned approaches in terms of inference time.","PeriodicalId":101485,"journal":{"name":"Proceedings of the 2023 5th International Conference on Pattern Recognition and Intelligent Systems","volume":"96 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Reconstructing 3D Shapes as an Union of Boxes from Multi-View Images\",\"authors\":\"Zihan Yang, Minglun Gong\",\"doi\":\"10.1145/3609703.3609705\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The task of reconstructing object shapes from input images has become increasingly important in various fields, such as computer vision, robotics, augmented reality, video games, and autonomous vehicles. While approaches for reconstructing shapes with varying levels of detail have been proposed, balancing representation accuracy and model complexity remains a challenge. To address this challenge, we propose an end-to-end approach for reconstructing object shapes from multiple images using a union of box primitives. Our approach offers a simpler and more efficient 3D representation of objects without the need for intermediate products such as voxels, resulting in faster inference times. Additionally, we introduce an auxiliary task to aid in learning how to extract and transform spatial features from images without requiring camera calibrations. Extensive experiments demonstrate that our method can produce comparable results to approaches that require 3D voxelized input while utilizing only 2D RGB images as input. Furthermore, our method significantly outperforms the aforementioned approaches in terms of inference time.\",\"PeriodicalId\":101485,\"journal\":{\"name\":\"Proceedings of the 2023 5th International Conference on Pattern Recognition and Intelligent Systems\",\"volume\":\"96 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-07-28\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2023 5th International Conference on Pattern Recognition and Intelligent Systems\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3609703.3609705\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2023 5th International Conference on Pattern Recognition and Intelligent Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3609703.3609705","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 0

摘要

从输入图像中重建物体形状的任务在计算机视觉、机器人、增强现实、视频游戏和自动驾驶汽车等各个领域变得越来越重要。虽然已经提出了重建具有不同细节级别的形状的方法，但平衡表示精度和模型复杂性仍然是一个挑战。为了解决这一挑战，我们提出了一种端到端方法，用于使用盒原语的联合从多个图像中重建物体形状。我们的方法提供了一种更简单、更有效的物体3D表示，而不需要体素等中间产品，从而加快了推理时间。此外，我们还引入了一个辅助任务来帮助学习如何在不需要相机校准的情况下从图像中提取和转换空间特征。大量的实验表明，我们的方法可以产生与只使用2D RGB图像作为输入而需要3D体素化输入的方法相当的结果。此外，我们的方法在推理时间方面明显优于上述方法。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Reconstructing 3D Shapes as an Union of Boxes from Multi-View Images

The task of reconstructing object shapes from input images has become increasingly important in various fields, such as computer vision, robotics, augmented reality, video games, and autonomous vehicles. While approaches for reconstructing shapes with varying levels of detail have been proposed, balancing representation accuracy and model complexity remains a challenge. To address this challenge, we propose an end-to-end approach for reconstructing object shapes from multiple images using a union of box primitives. Our approach offers a simpler and more efficient 3D representation of objects without the need for intermediate products such as voxels, resulting in faster inference times. Additionally, we introduce an auxiliary task to aid in learning how to extract and transform spatial features from images without requiring camera calibrations. Extensive experiments demonstrate that our method can produce comparable results to approaches that require 3D voxelized input while utilizing only 2D RGB images as input. Furthermore, our method significantly outperforms the aforementioned approaches in terms of inference time.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Proceedings of the 2023 5th International Conference on Pattern Recognition and Intelligent Systems

自引率

0.00%

发文量