Relationship graph learning network for visual relationship detection

Proceedings of the 2nd ACM International Conference on Multimedia in Asia Pub Date : 2021-03-07 DOI:10.1145/3444685.3446312

Yanan Li, Jun Yu, Yibing Zhan, Zhi Chen

{"title":"Relationship graph learning network for visual relationship detection","authors":"Yanan Li, Jun Yu, Yibing Zhan, Zhi Chen","doi":"10.1145/3444685.3446312","DOIUrl":null,"url":null,"abstract":"Visual relationship detection aims to predict the relationships between detected object pairs. It is well believed that the correlations between image components (i.e., objects and relationships between objects) are significant considerations when predicting objects' relationships. However, most current visual relationship detection methods only exploited the correlations among objects, and the correlations among objects' relationships remained underexplored. This paper proposes a relationship graph learning network (RGLN) to explore the correlations among objects' relationships for visual relationship detection. Specifically, RGLN obtains image objects using an object detector, and then, every pair of objects constitutes a relationship proposal. All relationship proposals construct a relationship graph, in which the proposals are treated as nodes. Accordingly, RGLN designs bi-stream graph attention subnetworks to detect relationship proposals, in which one graph attention subnetwork analyzes correlations among relationships based on visual and spatial information, and the other analyzes correlations based on semantic and spatial information. Besides, RGLN exploits a relationship selection subnetwork to ignore redundant information of object pairs with no relationships. We conduct extensive experiments on two public datasets: the VRD and the VG datasets. The experimental results compared with the state-of-the-art demonstrate the competitiveness of RGLN.","PeriodicalId":119278,"journal":{"name":"Proceedings of the 2nd ACM International Conference on Multimedia in Asia","volume":"10 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-03-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2nd ACM International Conference on Multimedia in Asia","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3444685.3446312","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 2

Abstract

Visual relationship detection aims to predict the relationships between detected object pairs. It is well believed that the correlations between image components (i.e., objects and relationships between objects) are significant considerations when predicting objects' relationships. However, most current visual relationship detection methods only exploited the correlations among objects, and the correlations among objects' relationships remained underexplored. This paper proposes a relationship graph learning network (RGLN) to explore the correlations among objects' relationships for visual relationship detection. Specifically, RGLN obtains image objects using an object detector, and then, every pair of objects constitutes a relationship proposal. All relationship proposals construct a relationship graph, in which the proposals are treated as nodes. Accordingly, RGLN designs bi-stream graph attention subnetworks to detect relationship proposals, in which one graph attention subnetwork analyzes correlations among relationships based on visual and spatial information, and the other analyzes correlations based on semantic and spatial information. Besides, RGLN exploits a relationship selection subnetwork to ignore redundant information of object pairs with no relationships. We conduct extensive experiments on two public datasets: the VRD and the VG datasets. The experimental results compared with the state-of-the-art demonstrate the competitiveness of RGLN.

查看原文本刊更多论文

用于视觉关系检测的关系图学习网络

视觉关系检测的目的是预测被检测对象对之间的关系。人们普遍认为，图像组件之间的相关性(即对象和对象之间的关系)是预测对象关系时的重要考虑因素。然而，目前大多数视觉关系检测方法只利用了物体之间的相关性，对物体之间的相关性的研究还不够充分。本文提出了一种关系图学习网络(RGLN)来探索对象之间关系的相关性，用于视觉关系检测。具体来说，RGLN使用对象检测器获得图像对象，然后，每对对象构成一个关系建议。所有的关系建议都构建一个关系图，其中的建议被视为节点。因此，RGLN设计了双流图注意子网来检测关系建议，其中一个图注意子网基于视觉和空间信息分析关系之间的相关性，另一个图注意子网基于语义和空间信息分析关系之间的相关性。此外，RGLN利用关系选择子网来忽略没有关系的对象对的冗余信息。我们在两个公共数据集上进行了大量的实验:VRD和VG数据集。实验结果表明，RGLN具有较强的竞争力。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

Proceedings of the 2nd ACM International Conference on Multimedia in Asia

自引率

0.00%

发文量