A2X: An Agent and Environment Interaction Benchmark for Multimodal Human Trajectory Prediction

Samuel S. Sohn, Mihee Lee, Seonghyeon Moon, Gang Qiao, Muhammad Usman, Sejong Yoon, V. Pavlovic, Mubbasir Kapadia
Proceedings of the 14th ACM SIGGRAPH Conference on Motion, Interaction and Games. Published 2021-11-10. DOI: https://doi.org/10.1145/3487983.3488302
Citations: 5

Abstract

In recent years, human trajectory prediction (HTP) has garnered attention in the computer vision literature. Although this task has much in common with the longstanding task of crowd simulation, little has been borrowed from crowd simulation, especially in terms of evaluation protocols. The key difference between the two tasks is that HTP is concerned with forecasting multiple steps at a time and capturing the multimodality of real human trajectories. A majority of HTP models are trained on the same few datasets, which feature small, transient interactions between real people and little to no interaction between people and the environment. Unsurprisingly, when tested on crowd egress scenarios, these models produce erroneous trajectories that accelerate too quickly and collide too frequently, but the metrics used in the HTP literature cannot convey these particular issues. To address these challenges, we propose (1) the A2X dataset, which has simulated crowd egress and complex navigation scenarios that compensate for the lack of agent-to-environment interaction in existing real datasets, and (2) evaluation metrics that convey model performance with more reliability and nuance. A subset of these metrics consists of novel multiverse metrics, which are better suited to multimodal models than existing metrics. The dataset is available at: https://mubbasir.github.io/HTP-benchmark/.
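To illustrate why displacement-only metrics can miss collisions and unrealistic acceleration, here is a minimal sketch. It does not reproduce the metrics proposed with A2X. It assumes the average and final displacement errors (ADE, FDE) and their best-of-K variant that are commonly reported in HTP work, plus two simple diagnostics written for this example. All function names, the agent radius, and the toy trajectories are assumptions made for illustration.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]
Trajectory = Sequence[Point]


def ade(pred: Trajectory, truth: Trajectory) -> float:
    """Average displacement error: mean Euclidean distance over time steps."""
    return sum(math.dist(p, t) for p, t in zip(pred, truth)) / len(truth)


def fde(pred: Trajectory, truth: Trajectory) -> float:
    """Final displacement error: distance between the last positions."""
    return math.dist(pred[-1], truth[-1])


def min_ade(samples: Sequence[Trajectory], truth: Trajectory) -> float:
    """Best-of-K ADE: the lowest ADE among K sampled predictions."""
    return min(ade(s, truth) for s in samples)


def collision_rate(trajs: Sequence[Trajectory], radius: float) -> float:
    """Fraction of time steps at which any two agents overlap.

    Agents are modeled as discs of the given radius (an assumption), so two
    agents overlap when their centers are closer than 2 * radius.
    """
    steps = len(trajs[0])
    hits = 0
    for t in range(steps):
        pts = [traj[t] for traj in trajs]
        if any(
            math.dist(pts[i], pts[j]) < 2 * radius
            for i in range(len(pts))
            for j in range(i + 1, len(pts))
        ):
            hits += 1
    return hits / steps


def max_acceleration(traj: Trajectory, dt: float) -> float:
    """Largest acceleration magnitude from finite differences of positions."""
    vel = [((b[0] - a[0]) / dt, (b[1] - a[1]) / dt) for a, b in zip(traj, traj[1:])]
    acc = [
        math.hypot((v2[0] - v1[0]) / dt, (v2[1] - v1[1]) / dt)
        for v1, v2 in zip(vel, vel[1:])
    ]
    return max(acc) if acc else 0.0


# Toy scene: two agents walk past each other in opposite directions.
truth_a = [(float(x), 0.5) for x in range(5)]
truth_b = [(float(4 - x), -0.5) for x in range(5)]
# Hypothetical predictions: close to the truth, but the lateral gap shrinks.
pred_a = [(float(x), 0.2) for x in range(5)]
pred_b = [(float(4 - x), -0.2) for x in range(5)]

print("ADE (agent a):", ade(pred_a, truth_a))
print("FDE (agent a):", fde(pred_a, truth_a))
print("collision rate, truth:", collision_rate([truth_a, truth_b], radius=0.3))
print("collision rate, pred: ", collision_rate([pred_a, pred_b], radius=0.3))
```

In this toy scene the prediction stays within roughly 0.3 units of the ground truth at every step, yet the two predicted agents overlap as they pass, which the ground-truth agents never do. ADE and FDE alone would not reveal that difference.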