Distortion-Aware Room Layout Estimation from A Single Fisheye Image

Ming Meng, Likai Xiao, Yi Zhou, Zhao-Qing Li, Zhong Zhou
Published in: 2021 IEEE International Symposium on Mixed and Augmented Reality (ISMAR)
Publication date: 2021-10-01
DOI: 10.1109/ismar52148.2021.00061 (https://doi.org/10.1109/ismar52148.2021.00061)
Citations: 1

Abstract

Omnidirectional images with a 180° or 360° field of view capture the entire visual content around the camera, enabling more sophisticated scene understanding and reasoning and opening broad application prospects for VR/AR/MR. As a result, research on layout estimation from omnidirectional images has grown rapidly in recent years. However, existing layout estimation methods designed for panoramic images do not perform well on fisheye images, mainly because of the lack of public fisheye datasets and the significant differences in the positions and degrees of distortion caused by different projection models. To fill these gaps, in this work we first reuse released large-scale panorama datasets and convert them to fisheye images via projection conversion, thereby circumventing the challenge of obtaining high-quality fisheye datasets with ground-truth layout annotations. Then, we propose a distortion-aware module based on the distortion of the orthographic projection (i.e., OrthConv) to extract features effectively from fisheye images. Additionally, we exploit a bidirectional LSTM with a two-dimensional step mode for horizontal and vertical prediction to capture the long-range geometric pattern of the object, yielding globally coherent predictions even in occluded and cluttered scenes. We extensively evaluate our deformable convolution on the room layout estimation task. Compared with state-of-the-art approaches, our approach produces considerable performance gains on both real-world and synthetic datasets. This technology provides efficient, low-cost technical implementations for VR house viewing and MR video surveillance. We present an MR-based building video surveillance scene equipped with nine fisheye lenses that achieves an immersive hybrid display experience and can be used for intelligent building management in the future.
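The panorama-to-fisheye projection conversion mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not describe the authors' actual procedure, so everything here is an assumption made for illustration: an equirectangular panorama, a 180° orthographic fisheye model (r = f·sin θ with the image circle normalised to radius 1), an optical axis pointing at the panorama centre, nearest-neighbour sampling, and the hypothetical function names `orthographic_fisheye_to_panorama` and `panorama_to_fisheye`.

```python
import math

def orthographic_fisheye_to_panorama(px, py, fisheye_size, pano_width, pano_height):
    """Return the panorama pixel (u, v) sampled by fisheye pixel (px, py),
    or None if the fisheye pixel lies outside the image circle.

    Assumed conventions (not taken from the paper): the optical axis is +z
    and maps to the panorama centre; image y grows downwards.
    """
    # Normalised fisheye coordinates in [-1, 1]; radius 1 is the image circle.
    x = (px + 0.5) / fisheye_size * 2.0 - 1.0
    y = (py + 0.5) / fisheye_size * 2.0 - 1.0
    r2 = x * x + y * y
    if r2 > 1.0:
        return None
    # Orthographic model: r = sin(theta), so the unit ray is (x, -y, sqrt(1 - r^2)).
    z = math.sqrt(1.0 - r2)
    lon = math.atan2(x, z)   # longitude in [-pi/2, pi/2]
    lat = math.asin(-y)      # latitude, positive upwards
    # Equirectangular lookup with nearest-neighbour rounding.
    u = int((lon / (2.0 * math.pi) + 0.5) * pano_width) % pano_width
    v = min(int((0.5 - lat / math.pi) * pano_height), pano_height - 1)
    return u, v

def panorama_to_fisheye(pano, fisheye_size):
    """Resample a panorama (list of rows) into a square fisheye image.
    Pixels outside the image circle are left at 0."""
    pano_height, pano_width = len(pano), len(pano[0])
    out = [[0] * fisheye_size for _ in range(fisheye_size)]
    for py in range(fisheye_size):
        for px in range(fisheye_size):
            hit = orthographic_fisheye_to_panorama(
                px, py, fisheye_size, pano_width, pano_height)
            if hit is not None:
                u, v = hit
                out[py][px] = pano[v][u]
    return out
```

Because the mapping is defined per pixel, the same geometry could in principle be applied in the opposite direction to carry ground-truth layout annotations from the panorama into the fisheye image; whether the authors do this is not stated in the abstract.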