Learning human-object interactions by attention aggregation

... International Workshop on Pattern Recognition in NeuroImaging. International Workshop on Pattern Recognition in NeuroImaging Pub Date : 2021-08-06 DOI:10.1117/12.2604708

Dongzhou Gu, Shuang Cai, Shiwei Ma

Volume: 68 1 Pages: 119130H - 119130H-5 Publication Type: Journal Article

Citations: 0

Abstract

Learning human-object interactions by attention aggregation

In recent years, deep neural networks have achieved impressive progress in object detection. However, detecting the interactions between objects remains challenging. Human-object interaction (HOI) detection has attracted considerable attention as a basic task in detailed scene understanding. Most conventional HOI detectors follow a two-stage design and are usually slow at inference. One-stage methods, which detect HOI triples directly and in parallel, break through the limitations of object detection, but the features they extract are still insufficient. To overcome these drawbacks, we propose an improved one-stage HOI detection approach in which an attention aggregation module and a dynamic point matching strategy play key roles. The attention aggregation module explicitly enhances the semantic expressiveness of interaction points by aggregating contextually important information, while the matching strategy effectively filters negative HOI pairs at the inference stage. Extensive experiments on two challenging HOI detection benchmarks, V-COCO and HICO-DET, show that our method achieves considerable performance compared with state-of-the-art methods without using any additional human pose or language features.
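The abstract does not say how the matching strategy is computed, so the following is only a rough sketch of the general idea of filtering negative HOI pairs by point matching at inference time. It is not the authors' implementation. It assumes each interaction point regresses offsets to a human center and an object center, as is common in point-based one-stage detectors. The names `Box`, `InteractionPoint`, `match_pairs`, and the size-scaled acceptance radius `ratio` are all hypothetical and chosen for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned detection box with a confidence score."""
    x1: float
    y1: float
    x2: float
    y2: float
    score: float

    @property
    def center(self):
        return ((self.x1 + self.x2) / 2.0, (self.y1 + self.y2) / 2.0)

    @property
    def short_side(self):
        return min(self.x2 - self.x1, self.y2 - self.y1)


@dataclass
class InteractionPoint:
    """Predicted interaction point with offsets to the human and object centers."""
    x: float
    y: float
    to_human: tuple   # (dx, dy) from the point to the human center
    to_object: tuple  # (dx, dy) from the point to the object center
    score: float


def _nearest(target, boxes, ratio):
    """Return (index, distance) of the closest accepted box center, or None.

    Assumption: the acceptance radius grows with the box's shorter side,
    which stands in for a "dynamic" threshold here.
    """
    best = None
    for i, box in enumerate(boxes):
        cx, cy = box.center
        dist = math.hypot(cx - target[0], cy - target[1])
        if dist <= ratio * box.short_side and (best is None or dist < best[1]):
            best = (i, dist)
    return best


def match_pairs(points, humans, objects, ratio=0.5):
    """Pair each interaction point with one human box and one object box.

    A point is dropped when either regressed center falls outside every
    acceptance radius, which discards negative human-object pairs.
    """
    triples = []
    for p in points:
        h = _nearest((p.x + p.to_human[0], p.y + p.to_human[1]), humans, ratio)
        o = _nearest((p.x + p.to_object[0], p.y + p.to_object[1]), objects, ratio)
        if h is None or o is None:
            continue
        score = p.score * humans[h[0]].score * objects[o[0]].score
        triples.append((h[0], o[0], score))
    return triples


# Toy example: the first point regresses exactly onto both centers,
# the second points its object offset far away from any detected object.
humans = [Box(0, 0, 40, 100, 0.9)]
objects = [Box(60, 40, 100, 80, 0.8)]
points = [
    InteractionPoint(50, 55, (-30, -5), (30, 5), 0.7),
    InteractionPoint(50, 55, (-30, -5), (100, 100), 0.9),
]
print(match_pairs(points, humans, objects))
```

Scaling the radius with box size means a large object tolerates a larger regression error than a small one. Whether the paper's dynamic strategy works this way is not stated in the abstract.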

Source journal

... International Workshop on Pattern Recognition in NeuroImaging. International Workshop on Pattern Recognition in NeuroImaging

Self-citation rate

0.00%

Number of publications