Samuel S. Sohn, Mihee Lee, Seonghyeon Moon, Gang Qiao, Muhammad Usman, Sejong Yoon, V. Pavlovic, Mubbasir Kapadia
{"title":"A2X:多模式人类轨迹预测的Agent和环境交互基准","authors":"Samuel S. Sohn, Mihee Lee, Seonghyeon Moon, Gang Qiao, Muhammad Usman, Sejong Yoon, V. Pavlovic, Mubbasir Kapadia","doi":"10.1145/3487983.3488302","DOIUrl":null,"url":null,"abstract":"In recent years, human trajectory prediction (HTP) has garnered attention in computer vision literature. Although this task has much in common with the longstanding task of crowd simulation, there is little from crowd simulation that has been borrowed, especially in terms of evaluation protocols. The key difference between the two tasks is that HTP is concerned with forecasting multiple steps at a time and capturing the multimodality of real human trajectories. A majority of HTP models are trained on the same few datasets, which feature small, transient interactions between real people and little to no interaction between people and the environment. Unsurprisingly, when tested on crowd egress scenarios, these models produce erroneous trajectories that accelerate too quickly and collide too frequently, but the metrics used in HTP literature cannot convey these particular issues. To address these challenges, we propose (1) the A2X dataset, which has simulated crowd egress and complex navigation scenarios that compensate for the lack of agent-to-environment interaction in existing real datasets, and (2) evaluation metrics that convey model performance with more reliability and nuance. A subset of these metrics are novel multiverse metrics, which are better-suited for multimodal models than existing metrics. 
The dataset is available at: https://mubbasir.github.io/HTP-benchmark/.","PeriodicalId":170509,"journal":{"name":"Proceedings of the 14th ACM SIGGRAPH Conference on Motion, Interaction and Games","volume":"42 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":"{\"title\":\"A2X: An Agent and Environment Interaction Benchmark for Multimodal Human Trajectory Prediction\",\"authors\":\"Samuel S. Sohn, Mihee Lee, Seonghyeon Moon, Gang Qiao, Muhammad Usman, Sejong Yoon, V. Pavlovic, Mubbasir Kapadia\",\"doi\":\"10.1145/3487983.3488302\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In recent years, human trajectory prediction (HTP) has garnered attention in computer vision literature. Although this task has much in common with the longstanding task of crowd simulation, there is little from crowd simulation that has been borrowed, especially in terms of evaluation protocols. The key difference between the two tasks is that HTP is concerned with forecasting multiple steps at a time and capturing the multimodality of real human trajectories. A majority of HTP models are trained on the same few datasets, which feature small, transient interactions between real people and little to no interaction between people and the environment. Unsurprisingly, when tested on crowd egress scenarios, these models produce erroneous trajectories that accelerate too quickly and collide too frequently, but the metrics used in HTP literature cannot convey these particular issues. To address these challenges, we propose (1) the A2X dataset, which has simulated crowd egress and complex navigation scenarios that compensate for the lack of agent-to-environment interaction in existing real datasets, and (2) evaluation metrics that convey model performance with more reliability and nuance. 
A subset of these metrics are novel multiverse metrics, which are better-suited for multimodal models than existing metrics. The dataset is available at: https://mubbasir.github.io/HTP-benchmark/.\",\"PeriodicalId\":170509,\"journal\":{\"name\":\"Proceedings of the 14th ACM SIGGRAPH Conference on Motion, Interaction and Games\",\"volume\":\"42 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-11-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"5\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 14th ACM SIGGRAPH Conference on Motion, Interaction and Games\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3487983.3488302\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 14th ACM SIGGRAPH Conference on Motion, Interaction and Games","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3487983.3488302","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
A2X: An Agent and Environment Interaction Benchmark for Multimodal Human Trajectory Prediction
In recent years, human trajectory prediction (HTP) has garnered attention in the computer vision literature. Although this task has much in common with the longstanding task of crowd simulation, little has been borrowed from crowd simulation, especially in terms of evaluation protocols. The key difference between the two tasks is that HTP is concerned with forecasting multiple steps at a time and with capturing the multimodality of real human trajectories. A majority of HTP models are trained on the same few datasets, which feature small, transient interactions between real people and little to no interaction between people and the environment. Unsurprisingly, when tested on crowd egress scenarios, these models produce erroneous trajectories that accelerate too quickly and collide too frequently, yet the metrics used in the HTP literature cannot convey these particular issues. To address these challenges, we propose (1) the A2X dataset, which contains simulated crowd egress and complex navigation scenarios that compensate for the lack of agent-to-environment interaction in existing real datasets, and (2) evaluation metrics that convey model performance with more reliability and nuance. Among these metrics are novel multiverse metrics, which are better suited to multimodal models than existing metrics. The dataset is available at: https://mubbasir.github.io/HTP-benchmark/.
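For context on the existing metrics the abstract critiques, the best-of-K displacement errors (minADE/minFDE) standard in HTP evaluation can be sketched as below. This is an illustrative sketch of the conventional protocol, not the paper's proposed multiverse metrics; the function names and toy trajectories are assumptions for the example. Note how these per-trajectory errors score only geometric closeness to the ground truth, so they cannot flag inter-agent collisions or implausible accelerations of the kind the abstract describes.

```python
import math

def ade(pred, gt):
    """Average Displacement Error: mean Euclidean distance over all timesteps."""
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(gt)

def fde(pred, gt):
    """Final Displacement Error: Euclidean distance at the last timestep."""
    return math.dist(pred[-1], gt[-1])

def min_ade_fde(preds, gt):
    """Best-of-K protocol: of K sampled trajectories, score only the one
    closest to the ground truth, so a multimodal model is not penalized
    for covering plausible alternative futures."""
    return min(ade(p, gt) for p in preds), min(fde(p, gt) for p in preds)

# Toy example: ground truth walks straight; the model emits K=2 samples,
# one near-correct mode and one diverging mode.
gt = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
preds = [
    [(0.0, 0.1), (1.0, 0.1), (2.0, 0.1)],  # near-correct mode
    [(0.0, 1.0), (1.0, 2.0), (2.0, 3.0)],  # wrong mode, ignored by min()
]
m_ade, m_fde = min_ade_fde(preds, gt)
```

Here both minADE and minFDE come out to 0.1, driven entirely by the best sample; the wrong mode contributes nothing, which is exactly the kind of blind spot that motivates richer evaluation metrics.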