HMA-Depth: A New Monocular Depth Estimation Model Using Hierarchical Multi-Scale Attention
Zhaofeng Niu, Yuichiro Fujimoto, M. Kanbara, H. Kato
2021 17th International Conference on Machine Vision and Applications (MVA), published 2021-07-25
DOI: 10.23919/MVA51890.2021.9511345
Abstract
Monocular depth estimation is an essential technique for tasks such as 3D reconstruction. Although many methods have emerged in recent years, they can be improved by making better use of the multi-scale information in the input images, which has proven to be one of the keys to generating high-quality depth estimates. In this paper, we propose a new monocular depth estimation method named HMA-Depth, which follows the encoder-decoder scheme and combines several techniques, such as skip connections and atrous spatial pyramid pooling. To obtain more precise local information from the image while retaining a good understanding of the global context, we adopt a hierarchical multi-scale attention module and combine its outputs to generate a final output that has both fine detail and good overall accuracy. Experimental results on two commonly used datasets show that HMA-Depth can outperform existing approaches. Code is available at https://github.com/saranew/HMADepth.
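The abstract does not state how the outputs of the attention module are combined, so the following is only a minimal sketch of one common form of attention-weighted scale fusion, not the authors' implementation. It assumes two depth predictions, a coarse one and a fine one at twice the resolution, and a per-pixel attention map at the coarse scale that would in practice be predicted by a network. The names `upsample_nearest` and `fuse_two_scales` and all numeric values are hypothetical, and plain Python lists are used to keep the example free of dependencies.

```python
from typing import List

Grid = List[List[float]]


def upsample_nearest(grid: Grid, factor: int) -> Grid:
    """Nearest-neighbour upsampling: repeat every cell `factor` times per axis."""
    out: Grid = []
    for row in grid:
        wide = [value for value in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def fuse_two_scales(depth_low: Grid, attn_low: Grid, depth_high: Grid,
                    factor: int = 2) -> Grid:
    """Blend a coarse and a fine depth map with a per-pixel attention weight.

    Assumed rule: fused = a * up(depth_low) + (1 - a) * depth_high,
    where a = up(attn_low) lies in [0, 1].
    """
    up_depth = upsample_nearest(depth_low, factor)
    up_attn = upsample_nearest(attn_low, factor)
    return [
        [a * d_lo + (1.0 - a) * d_hi
         for a, d_lo, d_hi in zip(attn_row, lo_row, hi_row)]
        for attn_row, lo_row, hi_row in zip(up_attn, up_depth, depth_high)
    ]


# Toy inputs (made-up values): a 1x2 coarse map and a 2x4 fine map.
depth_low = [[2.0, 4.0]]
attn_low = [[1.0, 0.0]]  # trust the coarse scale on the left, the fine scale on the right
depth_high = [[1.0, 1.0, 5.0, 6.0],
              [1.0, 1.0, 7.0, 8.0]]

fused = fuse_two_scales(depth_low, attn_low, depth_high)
```

With an attention weight of 1 the fused map copies the upsampled coarse prediction, and with a weight of 0 it keeps the fine prediction, which is how such a scheme can trade global context against local detail pixel by pixel. Applying the same blend repeatedly across more than two scales gives a hierarchical variant.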