Reconstruction of Dense Correspondences
M. Eisemann, Jan-Michael Frahm, Y. Rémion, Muhannad Ismaël
In: Digital Representations of the Real World. DOI: https://doi.org/10.1201/b18154-11
Abstract
This chapter concentrates on dense image correspondence estimation, with a special focus on stereo. Images are the basic input for the vast majority of algorithms that deal with reconstructing the real world. To analyze a scene from a collection of images, these images must first be put into correspondence. The correspondences then form the basis for many subsequent analyses, including camera calibration, stereo and 3D reconstruction, motion information, scene flow, and others. While some of these tasks, such as camera calibration, require only sparse correspondences between the images (see Chapter 7), others require per-pixel correspondences, a problem known as dense correspondence estimation. Humans are extremely good at solving the correspondence problem, which most of them do all the time during depth perception. The eyes serve as two cameras, slightly displaced with respect to each other, that capture the surroundings from two different viewpoints. By the time one focuses on an object at a certain distance, the brain has already computed an estimate of that distance and therefore of the object's position in space. The same problem turns out to be quite difficult for a computer and has been researched for several decades. The difficulty in correspondence estimation is caused by several factors: images are often corrupted by sensor noise, e.g., when recorded in a poorly lit environment (Section 1.1); the captured scene signal is discretized and represented at a finite image resolution; not every pixel has a corresponding partner in the other views, as it might be occluded; and ambiguities due to the absence of texture are difficult to resolve. Solving the dense correspondence problem makes a variety of applications possible, especially in the field of computer vision. Robot navigation and autonomous cars require depth perception to avoid obstacles [Giachetti et al. 98, Kastrinaki et al. 03]. 
Quality assurance in industrial applications is often based on stereo algorithms to detect cracks
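
To make the notion of per-pixel correspondence concrete, the following is a minimal sketch of window-based matching on a single scanline. It assumes a rectified stereo pair, so that a pixel at column x in the left image appears at column x - d in the right image, where d is the disparity. The function name `scanline_disparity`, its parameters, and the toy data are illustrative assumptions and are not taken from the chapter. The sketch uses a plain sum of absolute differences (SAD) cost and does nothing about the noise, occlusion, and missing texture described above.

```python
def scanline_disparity(left, right, max_disp=4, half_win=1):
    """Estimate a disparity for every pixel of one rectified scanline.

    Hypothetical helper: compares a small window around each left pixel
    with windows in the right row and keeps the shift with the lowest
    sum of absolute differences (SAD).
    """
    n = len(left)

    def px(row, i):
        # Clamp indices so windows near the border stay inside the row.
        return row[min(max(i, 0), n - 1)]

    disparities = []
    for x in range(n):
        best_d, best_cost = 0, float("inf")
        # Only shifts that keep the matched pixel inside the row are tried.
        for d in range(min(max_disp, x) + 1):
            cost = sum(
                abs(px(left, x + k) - px(right, x - d + k))
                for k in range(-half_win, half_win + 1)
            )
            if cost < best_cost:
                best_d, best_cost = d, cost
        disparities.append(best_d)
    return disparities


# Toy data (assumed): the left row is the right row shifted by two pixels.
right = [10, 50, 20, 90, 30, 70, 40, 80, 60, 15]
left = [0, 0] + right[:-2]
print(scanline_disparity(left, right))
```

On this well-textured toy row, the interior pixels recover the shift of two. If the intensities were constant, every candidate shift would give the same cost, which is the texture ambiguity mentioned in the abstract.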