Intention-guided imitation learning methods under limited expert demonstration data

IF 7.6 | Zone 1, Computer Science | Q1, Computer Science, Artificial Intelligence

Knowledge-Based Systems Pub Date : 2025-09-17 DOI:10.1016/j.knosys.2025.114455

Yilin Liu, Xiangfeng Luo, Shaorong Xie

Knowledge-Based Systems, vol. 330, Article 114455. Full text: https://www.sciencedirect.com/science/article/pii/S0950705125014947

Citations: 0

Abstract

Imitation learning has achieved significant results in fields such as robot control, autonomous driving, and unmanned vessel decision-making. It aims to mimic human behavior in specific tasks by learning the mapping between states and actions, enabling agents to execute tasks based on demonstrations. However, these methods depend on high-quality demonstration data and face challenges such as the difficulty and high cost of collecting expert samples and low efficiency in policy learning. Under limited-sample conditions in particular, imitation learning can easily become trapped in local optima, which lowers success rates and decision accuracy. Researchers have used data augmentation and transfer learning to cope with limited data, but in complex scenarios these methods are less effective because they lack domain-specific knowledge, which also affects the interpretability of the model. To address these challenges, we propose an Intention-guided Imitation Learning method under limited expert demonstration data (ITIL), which extracts deep intent features from a small number of samples to enhance the agent’s understanding of the scene and improve the accuracy of the state-to-action mapping during imitation learning. Specifically, the core method consists of three modules: (1) a Semantic Enhancement Module, which extracts spatiotemporal feature maps from a small number of raw trajectories to enrich the semantic information of expert data; (2) an Intention Expression Module, which constructs an intention tree network to establish connections between different levels, effectively expressing and capturing expert intent; and (3) a Strategy Generation Module, which takes the outputs of the first two modules as input to make efficient decisions, forming a closed-loop architecture of cognitive understanding, knowledge expression, and decision optimization. Experimental results show that our model outperforms baseline methods in navigation, capture, and formation tasks, with an average success rate improvement of approximately 6% over the ValueDICE baseline.
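The abstract does not give implementation details, so the following is only a toy sketch of how data might flow through the three modules it names. Everything here is an assumption made for illustration: the function and class names, the speed and heading features, the hand-written thresholds, and the tiny two-level tree are all hypothetical, and simple rules stand in for what the paper describes as learned networks.

```python
from dataclasses import dataclass, field
from math import atan2, hypot
from typing import Callable, Optional

Features = list[tuple[float, float]]  # per-step (speed, heading in radians)


def semantic_features(trajectory: list[tuple[float, float]]) -> Features:
    """Module 1 stand-in: derive per-step speed and heading from raw (x, y) points.

    Assumes the trajectory has at least two points.
    """
    return [
        (hypot(x1 - x0, y1 - y0), atan2(y1 - y0, x1 - x0))
        for (x0, y0), (x1, y1) in zip(trajectory, trajectory[1:])
    ]


@dataclass
class IntentNode:
    """Module 2 stand-in: one node of a hypothetical intention tree."""
    label: str
    test: Optional[Callable[[Features], bool]] = None  # None means "always matches"
    children: list["IntentNode"] = field(default_factory=list)


def mean_speed(feats: Features) -> float:
    return sum(speed for speed, _ in feats) / len(feats)


def heading_spread(feats: Features) -> float:
    headings = [heading for _, heading in feats]
    return max(headings) - min(headings)


# Hypothetical tree: coarse intent at the first level, finer intent below it.
# A child without a test acts as the default branch, so order matters.
TREE = IntentNode("task", children=[
    IntentNode("move", test=lambda f: mean_speed(f) > 0.1, children=[
        IntentNode("turn", test=lambda f: heading_spread(f) > 0.5),
        IntentNode("straight"),
    ]),
    IntentNode("hold"),
])


def infer_intent(root: IntentNode, feats: Features) -> list[str]:
    """Walk from the root, following the first child whose test passes."""
    path, node = [root.label], root
    while node.children:
        match = next(
            (c for c in node.children if c.test is None or c.test(feats)), None
        )
        if match is None:
            break
        path.append(match.label)
        node = match
    return path


def choose_action(feats: Features, intent: list[str]) -> tuple[float, float]:
    """Module 3 stand-in: combine features and intent into (speed, heading)."""
    speed, heading = feats[-1]
    leaf = intent[-1]
    if leaf == "hold":
        return (0.0, heading)
    if leaf == "turn" and len(feats) >= 2:
        # Extrapolate the last heading change; angle wrap-around is ignored.
        return (speed, heading + (heading - feats[-2][1]))
    return (speed, heading)


feats = semantic_features([(0, 0), (1, 0), (2, 0), (3, 0)])
intent = infer_intent(TREE, feats)
print(intent, choose_action(feats, intent))
```

The point of the sketch is the wiring: the decision step receives both the enriched features and the inferred intent path, which mirrors the abstract's description of the Strategy Generation Module consuming the outputs of the other two modules.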

Source journal

Knowledge-Based Systems (Engineering & Technology, Computer Science: Artificial Intelligence)

CiteScore

14.80

Self-citation rate

12.50%

Articles published

1245

Review time

7.8 months

Journal description: Knowledge-Based Systems, an international and interdisciplinary journal in artificial intelligence, publishes original, innovative, and creative research results in the field. It focuses on knowledge-based and other artificial intelligence techniques-based systems. The journal aims to support human prediction and decision-making through data science and computation techniques, provide a balanced coverage of theory and practical study, and encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.