{"title":"QRNet: Quaternion-Based Refinement Network for Surface Normal Estimation","authors":"Hanlin Bai;Xin Gao;Wei Deng;Jianwang Gan;Yijin Xiong;Kangkang Kou;Guoying Zhang","doi":"10.1109/TMM.2025.3535299","DOIUrl":null,"url":null,"abstract":"In recent years, there has been a notable increase in interest in image-based surface normal estimation. These approaches are capable of predicting the surface normal of real scenes using only an image, thereby facilitating a more profound comprehension of the actual scene and providing assistance with other perceptual tasks. However, dense regression predictions are susceptible to misdirection when encountering intricate details, which presents a paradoxical challenge for image-based surface normal estimation in reconciling detail and density. By introducing quaternion rotations as fusion module with geometric property, we propose a quaternion-based refined network structure that fuses detailed and structural information. Specifically, we design a high-resolution surface normal baseline with a streamlined structure, to extract fine-grained features while reducing the angular error in surface normal regression values caused by downsampling. Additionally, we propose a subtle angle loss function that prevents subtle changes from being overlooked without extra information, further enhancing the model's ability to learn detailed information. The proposed method demonstrates state-of-the-art performance compared to existing techniques on three real-world datasets comprising indoor and outdoor scenes. The results demonstrate the robust effectiveness of our deep learning approach that incorporates geometric prior guidance, highlighting improved robustness in applying deep learning methods. The source code will be released upon acceptance.","PeriodicalId":13273,"journal":{"name":"IEEE Transactions on Multimedia","volume":"27 ","pages":"3356-3369"},"PeriodicalIF":9.7000,"publicationDate":"2025-01-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Multimedia","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10858747/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 0
Abstract
In recent years, there has been a notable increase in interest in image-based surface normal estimation. These approaches are capable of predicting the surface normal of real scenes using only an image, thereby facilitating a more profound comprehension of the actual scene and providing assistance with other perceptual tasks. However, dense regression predictions are susceptible to misdirection when encountering intricate details, which presents a paradoxical challenge for image-based surface normal estimation in reconciling detail and density. By introducing quaternion rotations as fusion module with geometric property, we propose a quaternion-based refined network structure that fuses detailed and structural information. Specifically, we design a high-resolution surface normal baseline with a streamlined structure, to extract fine-grained features while reducing the angular error in surface normal regression values caused by downsampling. Additionally, we propose a subtle angle loss function that prevents subtle changes from being overlooked without extra information, further enhancing the model's ability to learn detailed information. The proposed method demonstrates state-of-the-art performance compared to existing techniques on three real-world datasets comprising indoor and outdoor scenes. The results demonstrate the robust effectiveness of our deep learning approach that incorporates geometric prior guidance, highlighting improved robustness in applying deep learning methods. The source code will be released upon acceptance.
期刊介绍:
The IEEE Transactions on Multimedia delves into diverse aspects of multimedia technology and applications, covering circuits, networking, signal processing, systems, software, and systems integration. The scope aligns with the Fields of Interest of the sponsors, ensuring a comprehensive exploration of research in multimedia.