Misleading Supervision Removal Mechanism for self-supervised monocular depth estimation

IF 3.7; Zone 2, Engineering & Technology; JCR Q1, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE
Xinzhou Fan, Jinze Xu, Feng Ye, Yizong Lai
Displays, vol. 88, Article 103043. Published 2025-04-01. Journal Article.
DOI: 10.1016/j.displa.2025.103043
https://www.sciencedirect.com/science/article/pii/S0141938225000800
Citations: 0

Abstract

Misleading Supervision Removal Mechanism for self-supervised monocular depth estimation
Self-supervised monocular depth estimation leverages the photometric consistency assumption and exploits geometric relations between image frames to convert depth errors into reprojection photometric errors. This allows the model to train effectively without explicit depth labels. However, the photometric consistency assumption does not always hold, the geometric relationships between image frames can be inaccurate, and sensors introduce noise. These factors limit the photometric error loss, which can easily introduce inaccurate supervision and mislead the model into local optima. To address this issue, this paper introduces a Misleading Supervision Removal Mechanism (MSRM), which aims to improve the accuracy of the supervisory signal by eliminating misleading cues. MSRM employs a composite masking strategy that combines pixel-level and image-level masks, where the pixel-level masks include sky masks, edge masks, and edge consistency techniques. MSRM largely eliminates the misleading supervision introduced by sky regions, edge regions, and images with low viewpoint changes. Because it does not alter the network architecture, MSRM does not increase inference time, making it a plug-and-play solution. Experiments applying MSRM to various self-supervised monocular depth estimation algorithms on the KITTI, Cityscapes, and Make3D datasets demonstrate that it significantly improves the prediction accuracy and generalization performance of the original algorithms.
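The abstract does not say how the masks are computed or how they enter the loss, so the following is only a minimal sketch of the general idea of masked photometric supervision, not the authors' implementation. It assumes the masks are already given as boolean arrays, uses a plain absolute-difference error (practical methods often add a structural similarity term), and all function names (`photometric_error`, `masked_mean`, `batch_loss`) are hypothetical.

```python
from typing import List

Image = List[List[float]]  # grayscale image with values in [0, 1]
Mask = List[List[bool]]    # True means the pixel contributes to the loss


def photometric_error(target: Image, warped: Image) -> Image:
    """Per-pixel absolute difference between the target frame and the
    source frame reprojected into the target view."""
    return [[abs(t - w) for t, w in zip(t_row, w_row)]
            for t_row, w_row in zip(target, warped)]


def masked_mean(error: Image, pixel_mask: Mask) -> float:
    """Average the error over the pixels the mask keeps.
    Returns 0.0 when the mask keeps no pixel."""
    kept = [e for e_row, m_row in zip(error, pixel_mask)
            for e, m in zip(e_row, m_row) if m]
    return sum(kept) / len(kept) if kept else 0.0


def batch_loss(targets: List[Image], warped: List[Image],
               pixel_masks: List[Mask], image_keep: List[bool]) -> float:
    """Mean masked loss over the images whose image-level flag is True.
    Images flagged False (assumed here to stand for frames with too
    little viewpoint change) are skipped entirely."""
    losses = [masked_mean(photometric_error(t, w), m)
              for t, w, m, keep in zip(targets, warped, pixel_masks, image_keep)
              if keep]
    return sum(losses) / len(losses) if losses else 0.0


# Toy 2x2 example: the bottom-left pixel has a large reprojection error
# and is excluded by the pixel-level mask (assumed to be, say, a sky pixel).
target = [[0.2, 0.4], [0.6, 0.8]]
warped = [[0.2, 0.5], [0.1, 0.8]]
mask = [[True, True], [False, True]]

error = photometric_error(target, warped)
unmasked = masked_mean(error, [[True, True], [True, True]])
masked = masked_mean(error, mask)
print(f"unmasked={unmasked:.4f} masked={masked:.4f}")
```

In this toy case the excluded pixel dominates the unmasked average, so masking it lowers the loss and removes its gradient contribution. Since the masks only affect the training loss, a scheme of this kind leaves the network and its inference cost unchanged, which is consistent with the plug-and-play claim in the abstract.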
Source journal: Displays (Engineering & Technology; Engineering: Electrical & Electronic)
CiteScore: 4.60
Self-citation rate: 25.60%
Articles published: 138
Review time: 92 days
About the journal: Displays is the international journal covering the research and development of display technology, its effective presentation and perception of information, and applications and systems including the display-human interface. Technical papers on practical developments in display technology provide an effective channel to promote greater understanding and cross-fertilization across the diverse disciplines of the displays community. Original research papers solving ergonomics issues at the display-human interface advance the effective presentation of information. Tutorial papers covering fundamentals, intended for display technologists and human factors engineers new to the field, will also occasionally be featured.