Weakly-Supervised Depth Estimation and Image Deblurring via Dual-Pixel Sensors

IEEE Transactions on Pattern Analysis and Machine Intelligence | Pub Date: 2024-09-12 | DOI: 10.1109/TPAMI.2024.3458974

Liyuan Pan;Richard Hartley;Liu Liu;Zhiwei Xu;Shah Chowdhury;Yan Yang;Hongguang Zhang;Hongdong Li;Miaomiao Liu

Volume 46, Issue 12, pp. 11314-11330. Publication type: Journal Article. Full text: https://ieeexplore.ieee.org/document/10679095/

Citations: 0

Abstract


Weakly-Supervised Depth Estimation and Image Deblurring via Dual-Pixel Sensors

Dual-pixel (DP) imaging sensors are increasingly adopted in modern cameras. A DP camera captures a pair of images in a single snapshot by splitting each pixel in half. Several previous studies show how to recover depth information by treating the DP pair as an approximate stereo pair. However, unlike classic stereo disparity, dual-pixel disparity occurs only in image regions with defocus blur. Heavy defocus blur in DP pairs degrades the performance of matching-based depth estimation approaches. Therefore, we treat blur removal and depth estimation as a joint problem. Rather than blindly removing the blur, we investigate the formation of the DP pair, which links blur and depth information. We propose a mathematical DP model that uses the blur to improve depth estimation. This exploration motivated our previous work, DDDNet (DP-based Depth and Deblur Network), an end-to-end network that jointly estimates depth and restores the image in a supervised fashion. However, collecting ground-truth (GT) depth maps for DP pairs is challenging, which limits the depth estimation potential of the DP sensor. Therefore, we propose an extension of DDDNet, called WDDNet (Weakly-supervised Depth and Deblur Network), which includes an efficient reblur solver that does not require GT depth maps for training. To achieve this, we convert all-in-focus images into supervisory signals for unsupervised depth estimation in WDDNet. We jointly estimate an all-in-focus image and a disparity map, then use a Reblur and Fstack module to regularize the disparity estimation and image restoration. Extensive experiments on synthetic and real data demonstrate that our method is competitive with state-of-the-art (SOTA) supervised approaches.
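To make the reblur idea in the abstract concrete, the following is a minimal NumPy sketch of how an estimated all-in-focus image and disparity map could be checked against an observed DP pair without GT depth. It is not the WDDNet Reblur module: the half-disc kernel model, the mapping of blur radius to |disparity|, the sign convention, and every function name (`half_disc_kernels`, `filter_same`, `reblur_dp`, `reblur_loss`) are assumptions made for illustration.

```python
import numpy as np


def half_disc_kernels(radius):
    """Return normalized (left, right) half-disc kernels of integer radius.

    Assumption: each DP view sees one half of a disc-shaped defocus kernel.
    """
    r = int(radius)
    ys, xs = np.mgrid[-r:r + 1, -r:r + 1]
    disc = (xs ** 2 + ys ** 2 <= r ** 2).astype(float)
    left = disc * (xs <= 0)
    right = disc * (xs >= 0)
    return left / left.sum(), right / right.sum()


def filter_same(img, kernel):
    """Correlate a 2D image with a small square kernel (edge padding, same size)."""
    r = kernel.shape[0] // 2
    padded = np.pad(img, r, mode="edge")
    h, w = img.shape
    out = np.zeros((h, w), dtype=float)
    for dy in range(kernel.shape[0]):
        for dx in range(kernel.shape[1]):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out


def reblur_dp(sharp, disparity, max_radius=4):
    """Synthesize a (left, right) DP pair from a sharp image and a disparity map.

    Assumptions: blur radius = round(|disparity|), clipped to max_radius;
    a negative disparity swaps the two half kernels.
    """
    radius = np.clip(np.rint(np.abs(disparity)), 0, max_radius).astype(int)
    left = np.zeros_like(sharp, dtype=float)
    right = np.zeros_like(sharp, dtype=float)
    for r in range(max_radius + 1):
        k_l, k_r = half_disc_kernels(r)
        blur_l, blur_r = filter_same(sharp, k_l), filter_same(sharp, k_r)
        front = (radius == r) & (disparity >= 0)
        back = (radius == r) & (disparity < 0)
        left[front], right[front] = blur_l[front], blur_r[front]
        left[back], right[back] = blur_r[back], blur_l[back]
    return left, right


def reblur_loss(sharp, disparity, dp_left, dp_right):
    """L1 difference between the synthesized and the observed DP pair."""
    syn_l, syn_r = reblur_dp(sharp, disparity)
    return float(np.mean(np.abs(syn_l - dp_left)) + np.mean(np.abs(syn_r - dp_right)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sharp = rng.random((16, 16))
    true_disp = np.full((16, 16), 3.0)
    dp_l, dp_r = reblur_dp(sharp, true_disp)  # stand-in for a captured DP pair
    print("loss with matching disparity:", reblur_loss(sharp, true_disp, dp_l, dp_r))
    print("loss with zero disparity:", reblur_loss(sharp, np.zeros((16, 16)), dp_l, dp_r))
```

In this toy setup the loss vanishes only when the candidate disparity reproduces the observed blur, which is the sense in which a reblur consistency term can stand in for depth labels. A trainable version would need a differentiable blur instead of the hard per-radius masks used here.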


Source journal

IEEE Transactions on Pattern Analysis and Machine Intelligence

Self-citation rate

0.00%

Publication count