Jianwei Ke, Alex J Watras, Jae-Jun Kim, Hewei Liu, Hongrui Jiang, Yu Hen Hu
{"title":"利用多视角RGB相机阵列实现实时三维可视化。","authors":"Jianwei Ke, Alex J Watras, Jae-Jun Kim, Hewei Liu, Hongrui Jiang, Yu Hen Hu","doi":"10.1007/s11265-021-01729-0","DOIUrl":null,"url":null,"abstract":"<p><p>A real-time 3D visualization (RT3DV) system using a multiview RGB camera array is presented. RT3DV can process multiple synchronized video streams to produce a stereo video of a dynamic scene from a chosen view angle. Its design objective is to facilitate 3D visualization at the video frame rate with good viewing quality. To facilitate 3D vision, RT3DV estimates and updates a surface mesh model formed directly from a set of sparse key points. The 3D coordinates of these key points are estimated from matching 2D key points across multiview video streams with the aid of epipolar geometry and trifocal tensor. To capture the scene dynamics, 2D key points in individual video streams are tracked between successive frames. We implemented a proof of concept RT3DV system tasked to process five synchronous video streams acquired by an RGB camera array. It achieves a processing speed of 44 milliseconds per frame and a peak signal to noise ratio (PSNR) of 15.9 dB from a viewpoint coinciding with a reference view. 
As a comparison, an image-based MVS algorithm utilizing a dense point cloud model and frame by frame feature detection and matching will require 7 seconds to render a frame and yield a reference view PSNR of 16.3 dB.</p>","PeriodicalId":50050,"journal":{"name":"Journal of Signal Processing Systems for Signal Image and Video Technology","volume":null,"pages":null},"PeriodicalIF":1.6000,"publicationDate":"2022-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9159639/pdf/nihms-1776819.pdf","citationCount":"2","resultStr":"{\"title\":\"Towards real-time 3D visualization with multiview RGB camera array.\",\"authors\":\"Jianwei Ke, Alex J Watras, Jae-Jun Kim, Hewei Liu, Hongrui Jiang, Yu Hen Hu\",\"doi\":\"10.1007/s11265-021-01729-0\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>A real-time 3D visualization (RT3DV) system using a multiview RGB camera array is presented. RT3DV can process multiple synchronized video streams to produce a stereo video of a dynamic scene from a chosen view angle. Its design objective is to facilitate 3D visualization at the video frame rate with good viewing quality. To facilitate 3D vision, RT3DV estimates and updates a surface mesh model formed directly from a set of sparse key points. The 3D coordinates of these key points are estimated from matching 2D key points across multiview video streams with the aid of epipolar geometry and trifocal tensor. To capture the scene dynamics, 2D key points in individual video streams are tracked between successive frames. We implemented a proof of concept RT3DV system tasked to process five synchronous video streams acquired by an RGB camera array. It achieves a processing speed of 44 milliseconds per frame and a peak signal to noise ratio (PSNR) of 15.9 dB from a viewpoint coinciding with a reference view. 
As a comparison, an image-based MVS algorithm utilizing a dense point cloud model and frame by frame feature detection and matching will require 7 seconds to render a frame and yield a reference view PSNR of 16.3 dB.</p>\",\"PeriodicalId\":50050,\"journal\":{\"name\":\"Journal of Signal Processing Systems for Signal Image and Video Technology\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":1.6000,\"publicationDate\":\"2022-03-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9159639/pdf/nihms-1776819.pdf\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Signal Processing Systems for Signal Image and Video Technology\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1007/s11265-021-01729-0\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Signal Processing Systems for Signal Image and Video Technology","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s11265-021-01729-0","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
Towards real-time 3D visualization with multiview RGB camera array.
A real-time 3D visualization (RT3DV) system using a multiview RGB camera array is presented. RT3DV processes multiple synchronized video streams to produce a stereo video of a dynamic scene from a chosen viewing angle. Its design objective is to support 3D visualization at the video frame rate with good viewing quality. To this end, RT3DV estimates and updates a surface mesh model formed directly from a set of sparse key points. The 3D coordinates of these key points are estimated by matching 2D key points across the multiview video streams with the aid of epipolar geometry and the trifocal tensor. To capture scene dynamics, the 2D key points in each video stream are tracked between successive frames. We implemented a proof-of-concept RT3DV system that processes five synchronized video streams acquired by an RGB camera array. It achieves a processing speed of 44 milliseconds per frame and a peak signal-to-noise ratio (PSNR) of 15.9 dB from a viewpoint coinciding with a reference view. For comparison, an image-based multiview stereo (MVS) algorithm using a dense point cloud model and frame-by-frame feature detection and matching requires 7 seconds to render a frame and yields a reference-view PSNR of 16.3 dB.
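The two core numerical operations mentioned in the abstract, estimating a key point's 3D coordinates from its matched 2D projections in multiple views, and scoring rendered output against a reference view with PSNR, can be illustrated in isolation. The sketch below is not the authors' code: it shows the standard linear (DLT) two-view triangulation and the textbook PSNR formula using NumPy; all function names and camera parameters are hypothetical.

```python
import numpy as np

def triangulate_point(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one 3D point from two views.

    P1, P2: 3x4 camera projection matrices.
    x1, x2: matched 2D pixel coordinates of the same key point.
    Each image point contributes two rows of the homogeneous system A X = 0;
    the solution is the right singular vector of A with smallest singular value.
    """
    A = np.array([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # dehomogenize to Euclidean 3D coordinates

def psnr(reference, rendered, peak=255.0):
    """Peak signal-to-noise ratio (dB) between two same-size images."""
    mse = np.mean((reference.astype(np.float64)
                   - rendered.astype(np.float64)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

# Hypothetical two-camera setup: shared intrinsics K, second camera
# translated 1 unit along x. Project a known point, then recover it.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0,   0.0,   1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

X_true = np.array([0.2, -0.1, 5.0])
x1h = P1 @ np.append(X_true, 1.0)
x2h = P2 @ np.append(X_true, 1.0)
x1, x2 = x1h[:2] / x1h[2], x2h[:2] / x2h[2]

X_est = triangulate_point(P1, P2, x1, x2)
```

In the paper's setting the same principle extends across five views (via the trifocal tensor for triplets), and in noise-free synthetic data like the above the DLT recovers the point essentially exactly.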
About the journal:
The Journal of Signal Processing Systems for Signal, Image, and Video Technology publishes research papers on the design and implementation of signal processing systems, with or without VLSI circuits. The journal is published in twelve issues and is distributed to engineers, researchers, and educators in the general field of signal processing systems.