Hand-Object Interaction Reasoning

2022 18th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS) Pub Date : 2022-01-13 DOI:10.1109/AVSS56176.2022.9959207

Jian Ma, D. Damen

Citations: 5

Abstract

This paper proposes an interaction reasoning network for modelling spatio-temporal relationships between hands and objects in egocentric video. The proposed interaction unit utilises a Transformer-style module to reason about each acting hand and its spatio-temporal relations to the other hand and to the objects being interacted with. We show that modelling two-handed interactions is critical for action recognition in egocentric video, and demonstrate that, by using positionally-encoded trajectories, the network can better recognise observed interactions. We train and evaluate our proposed network on the large-scale egocentric EPIC-KITCHENS-100 dataset and the crowd-sourced Something-Else dataset, with an ablation study to showcase our proposal.
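To make the idea of attention over positionally-encoded trajectories concrete, the following is a minimal sketch and not the authors' implementation. It assumes each hand or object is tracked as a sequence of normalised (x, y, w, h) boxes, encodes every box and its frame index with fixed sinusoidal features, and lets each token of the acting hand attend to the tokens of the other hand and the object. The function names, the feature size, and the absence of learned projections are all illustrative assumptions.

```python
import math

def sinusoidal_encoding(value, dim):
    """Encode one scalar as `dim` interleaved sin/cos features (dim must be even)."""
    feats = []
    for i in range(dim // 2):
        freq = 1.0 / (10000 ** (2 * i / dim))
        feats.append(math.sin(value * freq))
        feats.append(math.cos(value * freq))
    return feats

def encode_trajectory(boxes, dim=8):
    """Turn a list of (x, y, w, h) boxes into one token per frame.

    Assumption: a token concatenates the encodings of the four box
    coordinates and of the frame index, giving 5 * dim features.
    """
    tokens = []
    for t, box in enumerate(boxes):
        token = []
        for coord in box:
            token.extend(sinusoidal_encoding(coord, dim))
        token.extend(sinusoidal_encoding(float(t), dim))
        tokens.append(token)
    return tokens

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, contexts):
    """Scaled dot-product attention without learned projections (a simplification).

    `query` is one acting-hand token; `contexts` are the tokens of the
    other hand and the objects across all frames.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, ctx)) / scale for ctx in contexts]
    weights = softmax(scores)
    fused = [sum(w * ctx[j] for w, ctx in zip(weights, contexts))
             for j in range(len(query))]
    return weights, fused

# Hypothetical two-frame trajectories in normalised image coordinates.
left_hand = [(0.20, 0.60, 0.10, 0.10), (0.25, 0.58, 0.10, 0.10)]
right_hand = [(0.70, 0.62, 0.10, 0.10), (0.66, 0.60, 0.10, 0.10)]
held_object = [(0.45, 0.55, 0.15, 0.12), (0.45, 0.54, 0.15, 0.12)]

hand_tokens = encode_trajectory(left_hand)
context_tokens = encode_trajectory(right_hand) + encode_trajectory(held_object)

# One fused feature per frame of the acting hand.
per_frame = [attend(token, context_tokens) for token in hand_tokens]
weights, fused = per_frame[0]
```

In a trained network the queries, keys, and values would pass through learned linear projections and the trajectory tokens would typically be combined with appearance features; the sketch keeps only the attention pattern described in the abstract.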
