QRNet: Quaternion-Based Refinement Network for Surface Normal Estimation

IF 9.7, Zone 1 (Computer Science), Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS
Hanlin Bai;Xin Gao;Wei Deng;Jianwang Gan;Yijin Xiong;Kangkang Kou;Guoying Zhang
IEEE Transactions on Multimedia, vol. 27, pp. 3356-3369. Published 2025-01-31. DOI: 10.1109/TMM.2025.3535299. https://ieeexplore.ieee.org/document/10858747/
Citation count: 0

Abstract

Interest in image-based surface normal estimation has grown notably in recent years. These approaches predict the surface normals of real scenes using only an image, which supports a deeper understanding of the scene and assists other perception tasks. However, dense regression predictions are easily misled by intricate details, so image-based surface normal estimation faces a tension between preserving detail and producing dense predictions. By introducing quaternion rotations as a fusion module with geometric properties, we propose a quaternion-based refinement network that fuses detailed and structural information. Specifically, we design a high-resolution surface normal baseline with a streamlined structure to extract fine-grained features while reducing the angular error in surface normal regression values caused by downsampling. Additionally, we propose a subtle angle loss function that prevents subtle changes from being overlooked without requiring extra information, further enhancing the model's ability to learn detail. The proposed method achieves state-of-the-art performance compared with existing techniques on three real-world datasets comprising indoor and outdoor scenes. The results demonstrate the effectiveness and robustness of a deep learning approach that incorporates geometric prior guidance. The source code will be released upon acceptance.
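The abstract names two geometric ingredients without giving formulas: rotating a surface normal with a quaternion, and measuring the angular error between normals. The sketch below illustrates only those two textbook operations in plain Python. It is not the QRNet fusion module or the paper's subtle angle loss; the function names and the 10-degree example values are hypothetical and chosen purely for illustration.

```python
import math

def normalize(v):
    """Scale a 3-vector to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def quat_from_axis_angle(axis, angle_rad):
    """Unit quaternion (w, x, y, z) for a rotation of angle_rad about axis."""
    ax, ay, az = normalize(axis)
    s = math.sin(angle_rad / 2.0)
    return (math.cos(angle_rad / 2.0), ax * s, ay * s, az * s)

def quat_mul(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )

def rotate(q, v):
    """Rotate 3-vector v by unit quaternion q via q * (0, v) * conj(q)."""
    w, x, y, z = q
    conj = (w, -x, -y, -z)
    _, rx, ry, rz = quat_mul(quat_mul(q, (0.0, *v)), conj)
    return (rx, ry, rz)

def angular_error_deg(n1, n2):
    """Angle in degrees between two normals (clamped for numerical safety)."""
    a, b = normalize(n1), normalize(n2)
    dot = max(-1.0, min(1.0, sum(p * q for p, q in zip(a, b))))
    return math.degrees(math.acos(dot))

# Hypothetical example: a coarse normal pointing along +z is rotated
# by 10 degrees about the x axis, then compared with the original.
coarse = (0.0, 0.0, 1.0)
q = quat_from_axis_angle((1.0, 0.0, 0.0), math.radians(10.0))
refined = rotate(q, coarse)
error = angular_error_deg(coarse, refined)
```

Because a unit quaternion rotation preserves vector length, the rotated normal stays a unit vector without renormalization, which is one reason quaternions are a natural fit for adjusting normals. Here the angular error between the two normals equals the rotation angle.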
Source journal
IEEE Transactions on Multimedia (Engineering and Technology: Telecommunications)
CiteScore: 11.70
Self-citation rate: 11.00%
Articles published: 576
Review time: 5.5 months
Journal description: The IEEE Transactions on Multimedia covers diverse aspects of multimedia technology and applications, including circuits, networking, signal processing, systems, software, and systems integration. Its scope aligns with the Fields of Interest of its sponsors.