Travel Time-Dependent Maximum Entropy Inverse Reinforcement Learning for Seabird Trajectory Prediction

2017 4th IAPR Asian Conference on Pattern Recognition (ACPR). Pub Date: 2017-11-01. DOI: 10.1109/ACPR.2017.20

Tsubasa Hirakawa, Takayoshi Yamashita, K. Yoda, Toru Tamaki, H. Fujiyoshi

Citations: 3

Abstract

Trajectory prediction is a challenging problem in computer vision, robotics, and machine learning, and many prediction methods have been proposed. Most of them generate goal-directed trajectories, which head toward a goal in a straight line while avoiding obstacles. However, some trajectories are non-goal-directed: they take detours before reaching the goal. In this paper, we propose a method for predicting such non-goal-directed trajectories based on the maximum entropy inverse reinforcement learning framework. Our method introduces travel time as a state of the Markov decision process. As a practical example, we apply the proposed method to seabird trajectories measured with global positioning system loggers. Experimental results show that the proposed method can effectively predict non-goal-directed trajectories.
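The idea of treating travel time as part of the Markov decision process state can be illustrated with a toy example. The sketch below is not the authors' implementation; it assumes a 1-D corridor of cells, three actions (left, stay, right), a per-cell reward that is already known, and a trajectory that must be at the goal at exactly the final time step. The function name soft_policy and all parameters are hypothetical. It runs the backward (soft value iteration) pass commonly used in maximum entropy inverse reinforcement learning over the time-augmented states (cell, t).

```python
import math

NEG_INF = float("-inf")


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) that tolerates -inf entries."""
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))


def soft_policy(n_cells, goal, horizon, reward):
    """Soft value iteration over time-augmented states (cell, t).

    Assumption for illustration: the trajectory must sit on `goal`
    at t == horizon, so the remaining travel time shapes the policy.
    Returns V[t][s] and policy[t][s] = [p(left), p(stay), p(right)].
    """
    actions = (-1, 0, 1)
    V = [[NEG_INF] * n_cells for _ in range(horizon + 1)]
    V[horizon][goal] = 0.0  # only the goal is a valid terminal state
    policy = [[None] * n_cells for _ in range(horizon)]

    for t in range(horizon - 1, -1, -1):
        for s in range(n_cells):
            q = []
            for a in actions:
                nxt = s + a
                if 0 <= nxt < n_cells:
                    q.append(reward[s] + V[t + 1][nxt])
                else:
                    q.append(NEG_INF)  # moving off the corridor is not allowed
            V[t][s] = logsumexp(q)
            if V[t][s] > NEG_INF:
                # Stochastic policy: pi(a | s, t) = exp(Q(s, t, a) - V(s, t))
                policy[t][s] = [math.exp(qa - V[t][s]) for qa in q]
            # States that cannot reach the goal in time keep policy None.
    return V, policy


# Corridor of 5 cells, start at cell 0, goal at cell 4, zero reward everywhere.
_, tight = soft_policy(5, 4, 4, [0.0] * 5)   # just enough time for a straight run
_, loose = soft_policy(5, 4, 5, [0.0] * 5)   # one spare time step
print(tight[0][0])  # [0.0, 0.0, 1.0]
print(loose[0][0])
```

With exactly four time steps the only feasible behaviour from cell 0 is to move right every step, which corresponds to a goal-directed trajectory. With one spare step the policy also assigns probability to staying in place, so trajectories that do not head straight to the goal become possible. In this toy setting that change comes purely from adding time to the state.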



