Single-View 3D Reconstruction via SO(2)-Equivariant Gaussian Sculpting Networks

Ruihan Xu, Anthony Opipari, Joshua Mah, Stanley Lewis, Haoran Zhang, Hanzhe Guo, Odest Chadwicke Jenkins

arXiv:2409.07245 · arXiv - CS - Robotics · 2024-09-11
This paper introduces SO(2)-Equivariant Gaussian Sculpting Networks (GSNs) as an approach for SO(2)-equivariant 3D object reconstruction from single-view image observations. GSNs take a single observation as input and generate a Gaussian splat representation describing the observed object's geometry and texture. By using a shared feature extractor before decoding Gaussian colors, covariances, positions, and opacities, GSNs achieve extremely high throughput (>150 FPS).
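To make the shared-extractor design concrete, below is a minimal PyTorch sketch of a single-image-to-splat predictor. The backbone, layer sizes, number of Gaussians, activations, and the class name `GaussianSculptingHead` are all illustrative assumptions rather than the paper's actual architecture; the point is only that one encoder pass feeds four lightweight per-attribute heads, which is what makes very high throughput plausible.

```python
# Minimal sketch of a shared encoder feeding per-attribute Gaussian decoders.
# All sizes, layers, and names here are assumptions for illustration only.
import torch
import torch.nn as nn

class GaussianSculptingHead(nn.Module):
    """One shared image encoder, four small heads (colors, covariances,
    positions, opacities), as described in the abstract."""
    def __init__(self, feat_dim: int = 256, num_gaussians: int = 4096):
        super().__init__()
        self.num_gaussians = num_gaussians
        # Shared feature extractor (a stand-in for the real image backbone).
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Per-attribute decoders all consume the single encoder pass above.
        self.color = nn.Linear(feat_dim, num_gaussians * 3)       # RGB
        self.position = nn.Linear(feat_dim, num_gaussians * 3)    # xyz means
        self.covariance = nn.Linear(feat_dim, num_gaussians * 6)  # 6 params of a symmetric 3x3
        self.opacity = nn.Linear(feat_dim, num_gaussians)

    def forward(self, image: torch.Tensor) -> dict[str, torch.Tensor]:
        z = self.encoder(image)  # shared features, computed once
        n = self.num_gaussians
        return {
            "colors": torch.sigmoid(self.color(z)).view(-1, n, 3),
            "positions": self.position(z).view(-1, n, 3),
            "covariances": self.covariance(z).view(-1, n, 6),
            "opacities": torch.sigmoid(self.opacity(z)).view(-1, n, 1),
        }

# Usage: one forward pass maps a single RGB observation to a splat.
splat = GaussianSculptingHead()(torch.randn(1, 3, 128, 128))
```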
Experiments demonstrate that GSNs can be trained efficiently using a multi-view rendering loss and are competitive in quality with expensive diffusion-based reconstruction algorithms. The GSN model is validated on multiple benchmark experiments. Moreover, we demonstrate the potential for GSNs to be used within a robotic manipulation pipeline for object-centric grasping.
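The multi-view rendering loss can be sketched as follows: the splat predicted from one view is rendered from several other known camera poses and compared against the corresponding ground-truth images. The renderer below is a deliberately trivial stub (a real pipeline would use a differentiable Gaussian splatting rasterizer, which this sketch does not reproduce), and the names `render_gaussians` and `multiview_loss` are hypothetical.

```python
# Hedged sketch of a multi-view rendering loss; the renderer is a stub.
import torch
import torch.nn.functional as F

def render_gaussians(splat: dict, camera_pose: torch.Tensor) -> torch.Tensor:
    """Placeholder for a differentiable splat rasterizer: returns a (B,3,H,W)
    image. A real implementation would rasterize the Gaussians under camera_pose."""
    b = splat["colors"].shape[0]
    # Dummy projection so the sketch runs end to end; NOT a real rasterizer.
    return splat["colors"].mean(dim=1).view(b, 3, 1, 1).expand(b, 3, 64, 64)

def multiview_loss(model, image, target_views, poses):
    """Render the splat predicted from a single image under several known
    poses and penalize pixel error against the ground-truth views."""
    splat = model(image)  # single observation -> Gaussian splat (see sketch above)
    loss = sum(F.mse_loss(render_gaussians(splat, p), t)
               for t, p in zip(target_views, poses))
    return loss / len(poses)
```

Because the supervision comes only from rendered images, a setup like this needs no ground-truth 3D geometry, which is consistent with the training efficiency claimed in the abstract.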