Cystoscopic depth estimation using gated adversarial domain adaptation.

IF 3.2 | Zone 4, Medicine | JCR Q2 | ENGINEERING, BIOMEDICAL

Biomedical Engineering Letters | Pub Date: 2023-05-01 | DOI: 10.1007/s13534-023-00261-3

Peter Somers, Simon Holdenried-Krafft, Johannes Zahn, Johannes Schüle, Carina Veil, Niklas Harland, Simon Walz, Arnulf Stenzl, Oliver Sawodny, Cristina Tarín, Hendrik P A Lensch

Volume 13, issue 2, pages 141-151. Publication type: Journal Article. Full-text PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10130294/pdf/

Citations: 0

Abstract

Monocular depth estimation from camera images is important for evaluating the surrounding scene in many technical fields, from automotive to medicine. However, traditional triangulation methods that use stereo cameras or multiple views and assume a rigid environment are not applicable in endoscopic domains. In cystoscopy in particular, it is not possible to produce ground-truth depth information with which to directly train machine learning algorithms to predict depth from a single monocular image. This work first creates a synthetic cystoscopic environment for the initial encoding of depth information from synthetically rendered images. Next, the task of predicting pixel-wise depth values for real images is constrained to a domain adaptation between the synthetic and real image domains. This adaptation is performed through added gated residual blocks in order to simplify the network task and maintain stability during adversarial training. Training is performed on an internally collected cystoscopy dataset from human patients. After training, the results demonstrate the ability to predict reasonable depth estimates from actual cystoscopic videos, and the added stability from using gated residual blocks is shown to prevent mode collapse during adversarial training.
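The abstract does not describe the internal design of the gated residual blocks. As a rough illustration only, the sketch below assumes one common reading: a residual branch whose contribution is scaled by a gate, so that with the gate closed the block is an identity mapping and the network trained on synthetic images is left unchanged. The class name `GatedResidualBlock`, the `weights` and `gate` fields, and the single linear-plus-ReLU branch are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List

Vector = List[float]


def relu(values: Vector) -> Vector:
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def matvec(matrix: List[Vector], vector: Vector) -> Vector:
    """Plain matrix-vector product on nested lists."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


@dataclass
class GatedResidualBlock:
    """Hypothetical gated residual block: output = x + gate * branch(x).

    Assumption (not from the paper): the gate is a scalar in [0, 1]
    that starts at 0, so the block initially passes features through
    unchanged and can only gradually adjust them during adaptation.
    """

    weights: List[Vector]  # hypothetical residual-branch weights (square matrix)
    gate: float = 0.0      # hypothetical gate value, 0 = closed

    def forward(self, features: Vector) -> Vector:
        # Residual branch: one linear map followed by a ReLU.
        residual = relu(matvec(self.weights, features))
        # Gated skip connection: identity plus the scaled residual.
        return [x + self.gate * r for x, r in zip(features, residual)]


block = GatedResidualBlock(weights=[[0.5, -1.0], [2.0, 0.25]])
features = [1.0, 2.0]

closed = block.forward(features)   # gate closed: identical to the input
block.gate = 1.0
opened = block.forward(features)   # gate open: residual fully added
```

Under this assumption, a closed gate leaves the pretrained depth features untouched, which is one plausible way such blocks could limit how far adversarial training moves the network. Whether the authors use a scalar gate, a per-channel gate, or another mechanism is not stated in the abstract.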


Source journal: Biomedical Engineering Letters (ENGINEERING, BIOMEDICAL)

CiteScore: 6.80

Self-citation rate: 0.00%

Journal description: Biomedical Engineering Letters (BMEL) aims to present innovative experimental science and technological development in the biomedical field, as well as clinical applications of new developments. Articles must contain original biomedical engineering content, defined as the development, theoretical analysis, and evaluation/validation of a new technique. BMEL publishes the following types of papers: original articles, review articles, editorials, and letters to the editor. All papers are reviewed in single-blind fashion.