Decision-Making for Multi-View Single Object Detection with Graph Convolutional Networks

Journal of Multimedia Information System. Pub Date: 2023-09-30. DOI: 10.33851/jmis.2023.10.3.207

Ren Wang, Tae Sung Kim, Tae-Ho Lee, Jin-Sung Kim, Hyuk-Jae Lee


Abstract

Aggregating the outputs predicted from multiple views helps boost multi-view single object detection performance. Decision-making strategies offer a flexible way to perform this result-level aggregation. However, these strategies do not exploit the relationships among the views during aggregation. This study proposes a novel decision-making model with graph convolutional networks (DM-GCN) that addresses this issue by using graph convolutional networks to establish relationships among the predicted outputs. Through training, DM-GCN learns to make a correct decision by enhancing the contributions of informative views. DM-GCN is lightweight and fast, and it can be applied to any object detector at negligible computational cost. Moreover, a dataset captured in the real world, named Yogurt10, is proposed together with a new metric to investigate the performance of DM-GCN on the multi-view single object detection task. Experimental results show that DM-GCN outperforms classical decision-making strategies. A visual explanation is also provided to interpret how DM-GCN makes a correct decision.
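To make the idea of result-level aggregation concrete, the following is a minimal sketch, not the authors' implementation. It assumes each view yields one vector of per-class scores, shows two classical decision-making strategies (majority voting and score averaging), and then applies a single graph-convolution step using the standard symmetric-normalized propagation rule over a fully connected view graph. The function names, the fully connected graph, and the example scores are all hypothetical; the abstract does not specify the DM-GCN architecture.

```python
from collections import Counter


def argmax(scores):
    """Index of the largest value in a list of scores."""
    return max(range(len(scores)), key=scores.__getitem__)


def majority_vote(view_scores):
    """Classical strategy: each view votes for its top class; most votes wins."""
    votes = [argmax(s) for s in view_scores]
    return Counter(votes).most_common(1)[0][0]


def average_scores(view_scores):
    """Classical strategy: average class scores over views, then take the top class."""
    n_views = len(view_scores)
    n_classes = len(view_scores[0])
    mean = [sum(s[c] for s in view_scores) / n_views for c in range(n_classes)]
    return argmax(mean)


def matmul(a, b):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def gcn_layer(features, adjacency, weight):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 X W).

    Assumption: views are graph nodes and `features` holds one score
    vector per view. `weight` would be learned during training.
    """
    n = len(features)
    # Add self-loops so each view keeps its own prediction.
    a_hat = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric degree normalization.
    norm = [[a_hat[i][j] / ((deg[i] ** 0.5) * (deg[j] ** 0.5)) for j in range(n)]
            for i in range(n)]
    out = matmul(matmul(norm, features), weight)
    return [[max(0.0, v) for v in row] for row in out]


# Hypothetical scores: 3 views, 2 classes. One confident view, two weak ones.
views = [[0.9, 0.1], [0.4, 0.6], [0.45, 0.55]]
fully_connected = [[0.0, 1.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]

print(majority_vote(views))   # the two weak views outvote the confident one
print(average_scores(views))  # the confident view dominates the mean
print(gcn_layer(views, fully_connected, identity))
```

In this example the two classical strategies disagree, which is the kind of case where the choice of aggregation matters. With an untrained identity weight and a fully connected graph, the graph-convolution step reduces to plain averaging across views; any benefit would come from learned weights that emphasize informative views, as the abstract describes.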


