Multi-Scale Estimation for Omni-Directional Saliency Maps Using Learnable Equator Bias

IF 0.6 | Tier 4 (Computer Science) | Q4 COMPUTER SCIENCE, INFORMATION SYSTEMS
Takao YAMANAKA, Tatsuya SUZUKI, Taiki NOBUTSUNE, Chenjunlin WU
{"title":"基于可学习赤道偏差的全向显著性图多尺度估计","authors":"Takao YAMANAKA, Tatsuya SUZUKI, Taiki NOBUTSUNE, Chenjunlin WU","doi":"10.1587/transinf.2023edp7055","DOIUrl":null,"url":null,"abstract":"Omni-directional images have been used in wide range of applications including virtual/augmented realities, self-driving cars, robotics simulators, and surveillance systems. For these applications, it would be useful to estimate saliency maps representing probability distributions of gazing points with a head-mounted display, to detect important regions in the omni-directional images. This paper proposes a novel saliency-map estimation model for the omni-directional images by extracting overlapping 2-dimensional (2D) plane images from omni-directional images at various directions and angles of view. While 2D saliency maps tend to have high probability at the center of images (center bias), the high-probability region appears at horizontal directions in omni-directional saliency maps when a head-mounted display is used (equator bias). Therefore, the 2D saliency model with a center-bias layer was fine-tuned with an omni-directional dataset by replacing the center-bias layer to an equator-bias layer conditioned on the elevation angle for the extraction of the 2D plane image. The limited availability of omni-directional images in saliency datasets can be compensated by using the well-established 2D saliency model pretrained by a large number of training images with the ground truth of 2D saliency maps. In addition, this paper proposes a multi-scale estimation method by extracting 2D images in multiple angles of view to detect objects of various sizes with variable receptive fields. The saliency maps estimated from the multiple angles of view were integrated by using pixel-wise attention weights calculated in an integration layer for weighting the optimal scale to each object. The proposed method was evaluated using a publicly available dataset with evaluation metrics for omni-directional saliency maps. It was confirmed that the accuracy of the saliency maps was improved by the proposed method.","PeriodicalId":55002,"journal":{"name":"IEICE Transactions on Information and Systems","volume":null,"pages":null},"PeriodicalIF":0.6000,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Multi-Scale Estimation for Omni-Directional Saliency Maps Using Learnable Equator Bias\",\"authors\":\"Takao YAMANAKA, Tatsuya SUZUKI, Taiki NOBUTSUNE, Chenjunlin WU\",\"doi\":\"10.1587/transinf.2023edp7055\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Omni-directional images have been used in wide range of applications including virtual/augmented realities, self-driving cars, robotics simulators, and surveillance systems. For these applications, it would be useful to estimate saliency maps representing probability distributions of gazing points with a head-mounted display, to detect important regions in the omni-directional images. This paper proposes a novel saliency-map estimation model for the omni-directional images by extracting overlapping 2-dimensional (2D) plane images from omni-directional images at various directions and angles of view. While 2D saliency maps tend to have high probability at the center of images (center bias), the high-probability region appears at horizontal directions in omni-directional saliency maps when a head-mounted display is used (equator bias). 
Therefore, the 2D saliency model with a center-bias layer was fine-tuned with an omni-directional dataset by replacing the center-bias layer to an equator-bias layer conditioned on the elevation angle for the extraction of the 2D plane image. The limited availability of omni-directional images in saliency datasets can be compensated by using the well-established 2D saliency model pretrained by a large number of training images with the ground truth of 2D saliency maps. In addition, this paper proposes a multi-scale estimation method by extracting 2D images in multiple angles of view to detect objects of various sizes with variable receptive fields. The saliency maps estimated from the multiple angles of view were integrated by using pixel-wise attention weights calculated in an integration layer for weighting the optimal scale to each object. The proposed method was evaluated using a publicly available dataset with evaluation metrics for omni-directional saliency maps. It was confirmed that the accuracy of the saliency maps was improved by the proposed method.\",\"PeriodicalId\":55002,\"journal\":{\"name\":\"IEICE Transactions on Information and Systems\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.6000,\"publicationDate\":\"2023-10-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEICE Transactions on Information and Systems\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1587/transinf.2023edp7055\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q4\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEICE Transactions on Information and Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1587/transinf.2023edp7055","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
Citations: 0

Abstract

Omni-directional images have been used in a wide range of applications including virtual/augmented realities, self-driving cars, robotics simulators, and surveillance systems. For these applications, it would be useful to estimate saliency maps representing probability distributions of gazing points with a head-mounted display, to detect important regions in the omni-directional images. This paper proposes a novel saliency-map estimation model for the omni-directional images by extracting overlapping 2-dimensional (2D) plane images from omni-directional images at various directions and angles of view. While 2D saliency maps tend to have high probability at the center of images (center bias), the high-probability region appears at horizontal directions in omni-directional saliency maps when a head-mounted display is used (equator bias). Therefore, the 2D saliency model with a center-bias layer was fine-tuned with an omni-directional dataset by replacing the center-bias layer with an equator-bias layer conditioned on the elevation angle for the extraction of the 2D plane image. The limited availability of omni-directional images in saliency datasets can be compensated by using the well-established 2D saliency model pretrained by a large number of training images with the ground truth of 2D saliency maps. In addition, this paper proposes a multi-scale estimation method by extracting 2D images in multiple angles of view to detect objects of various sizes with variable receptive fields. The saliency maps estimated from the multiple angles of view were integrated by using pixel-wise attention weights calculated in an integration layer for weighting the optimal scale to each object. The proposed method was evaluated using a publicly available dataset with evaluation metrics for omni-directional saliency maps. It was confirmed that the accuracy of the saliency maps was improved by the proposed method.
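The equator-bias layer described in the abstract can be pictured with a short sketch. The following is a minimal PyTorch illustration, not the authors' implementation: the number of elevation bins, the bias-map resolution, the linear interpolation between bins, and the multiplicative softplus form are all assumptions made for the example. The idea shown is that a set of learnable bias maps is indexed by the elevation angle of the extracted 2D plane image and applied to that view's 2D saliency map, so views near the equator can learn a different prior than views toward the poles.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F


class EquatorBias(nn.Module):
    """Learnable bias maps indexed by the elevation angle of an extracted view.

    num_bins, h, and w are illustrative hyper-parameters, not values from the paper.
    """

    def __init__(self, num_bins: int = 9, h: int = 30, w: int = 40):
        super().__init__()
        # One learnable bias map per elevation bin; zero initialization gives a
        # near-identity start because of the softplus(bias + 1.0) in forward().
        self.bias_maps = nn.Parameter(torch.zeros(num_bins, 1, h, w))
        self.num_bins = num_bins

    def forward(self, saliency: torch.Tensor, elevation: torch.Tensor) -> torch.Tensor:
        # saliency:  (B, 1, H, W) saliency map of one extracted 2D plane image
        # elevation: (B,) elevation angle of that view in radians, in [-pi/2, pi/2]
        # Convert the angle to a fractional bin index and linearly interpolate
        # between the two nearest learnable bias maps.
        pos = (elevation / math.pi + 0.5) * (self.num_bins - 1)
        lo = pos.floor().long().clamp(0, self.num_bins - 1)
        hi = (lo + 1).clamp(max=self.num_bins - 1)
        frac = (pos - lo.float()).view(-1, 1, 1, 1)
        bias = (1.0 - frac) * self.bias_maps[lo] + frac * self.bias_maps[hi]
        bias = F.interpolate(bias, size=saliency.shape[-2:], mode="bilinear",
                             align_corners=False)
        # Apply the bias multiplicatively (softplus keeps it positive) and
        # renormalize so the output still behaves like a probability map.
        out = saliency * F.softplus(bias + 1.0)
        return out / out.sum(dim=(-2, -1), keepdim=True).clamp_min(1e-8)


# Usage sketch: four extracted views at different (hypothetical) elevations.
layer = EquatorBias()
sal = torch.rand(4, 1, 240, 320)
sal = sal / sal.sum(dim=(-2, -1), keepdim=True)
elev = torch.tensor([0.0, 0.3, -0.6, 1.2])  # radians
out = layer(sal, elev)                       # (4, 1, 240, 320)
```

Because the bias maps are parameters of the layer, they can be learned during fine-tuning on an omni-directional dataset while the pretrained 2D saliency backbone is kept largely intact, which is consistent with the fine-tuning strategy the abstract describes.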
Source Journal
IEICE Transactions on Information and Systems
Engineering & Technology - Computer Science: Software Engineering
CiteScore: 1.80
Self-citation rate: 0.00%
Articles per year: 238
Review time: 5.0 months
Journal description: Published by The Institute of Electronics, Information and Communication Engineers. Subject areas: Mathematics; Physics; Biology, Life Sciences and Basic Medicine; General Medicine, Social Medicine, and Nursing Sciences; Clinical Medicine; Engineering in General; Nanosciences and Materials Sciences; Mechanical Engineering; Electrical and Electronic Engineering; Information Sciences; Economics, Business & Management; Psychology, Education.