Encoding flexible gait strategies in stick insects through data-driven inverse reinforcement learning.

IF 3.1 | Zone 3, Computer Science | JCR Q1, ENGINEERING, MULTIDISCIPLINARY

Bioinspiration & Biomimetics Pub Date : 2025-06-05 DOI:10.1088/1748-3190/addc26

Yuchen Wang, Mitsuhiro Hayashibe, Dai Owaki


Citations: 0

Abstract

Stick insects exhibit remarkable adaptive walking capabilities across diverse environments; however, the mechanisms underlying their gait transitions remain poorly understood. Although reinforcement learning (RL) has been employed to generate insect-like gaits, the design of an appropriate reward function presents a challenge due to the probabilistic and continuous nature of gait transitions. This study utilized maximum entropy inverse RL to infer the reward function that governs stick insect gait selection, incorporating walking dynamic parameters (namely, velocity, direction, and acceleration) alongside antenna joint movements as state variables. By analyzing the inferred reward structures, we clarified the underlying principles that drive gait transitions and emphasized the role of sensory feedback in gait modulation. The efficacy of the inferred policies was validated through an assessment of their ability to reproduce expert trajectories, demonstrating that stick insect gaits can be learned from observable states during locomotion. Furthermore, interspecies variations and noncanonical gait patterns were examined, providing insights into the flexibility and adaptability of insect locomotion. This data-driven approach offers a biologically interpretable framework for gait modeling and contributes to bioinspired robotic design by facilitating adaptive control strategies for hexapod robots.
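
As a rough illustration of the maximum entropy inverse RL idea named in the abstract, the sketch below recovers a reward vector from a few demonstration trajectories on a toy five-state chain. It is not the authors' implementation: the chain, the three actions, the one-hot state features, the horizon, and the demonstrations in `DEMOS` are all hypothetical stand-ins, and the paper's actual state variables (velocity, direction, acceleration, antenna joint movements) are not modeled. The update shown is the standard MaxEnt gradient, expert state-visit counts minus the counts expected under the current soft-optimal policy.

```python
import math

N_STATES = 5            # hypothetical discretised state bins (toy chain)
ACTIONS = (-1, 0, 1)    # hypothetical actions: move left, stay, move right
HORIZON = 6             # number of states in every trajectory


def step(s, a):
    """Deterministic toy dynamics: move along the chain, clipped at both ends."""
    return min(max(s + a, 0), N_STATES - 1)


def logsumexp(xs):
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def soft_policy(theta):
    """Backward pass: finite-horizon soft value iteration.

    Returns policy[t][s][i], the probability of ACTIONS[i] in state s at time t.
    """
    v = [0.0] * N_STATES  # soft value after the last state is zero
    policy = []
    for _ in range(HORIZON - 1):
        # Reward is collected on the state that the action leads to.
        q = [[theta[step(s, a)] + v[step(s, a)] for a in ACTIONS]
             for s in range(N_STATES)]
        v = [logsumexp(row) for row in q]
        policy.append([[math.exp(x - v[s]) for x in q[s]]
                       for s in range(N_STATES)])
    policy.reverse()  # built from the last decision back to the first
    return policy


def expected_visits(policy, start):
    """Forward pass: expected number of visits to each state per trajectory."""
    d = list(start)
    visits = list(start)
    for pi_t in policy:
        nxt = [0.0] * N_STATES
        for s in range(N_STATES):
            for i, a in enumerate(ACTIONS):
                nxt[step(s, a)] += d[s] * pi_t[s][i]
        d = nxt
        visits = [v + x for v, x in zip(visits, d)]
    return visits


def maxent_irl(demos, lr=0.05, iters=300):
    """Gradient ascent on the demonstration log-likelihood (one-hot features)."""
    n = len(demos)
    mu_expert = [sum(t.count(s) for t in demos) / n for s in range(N_STATES)]
    start = [sum(t[0] == s for t in demos) / n for s in range(N_STATES)]
    theta = [0.0] * N_STATES
    for _ in range(iters):
        mu_model = expected_visits(soft_policy(theta), start)
        theta = [w + lr * (e - m) for w, e, m in zip(theta, mu_expert, mu_model)]
    return theta


# Hypothetical "expert" trajectories: walk to state 4 and remain there.
DEMOS = [[0, 1, 2, 3, 4, 4], [1, 2, 3, 4, 4, 4], [0, 1, 2, 3, 4, 4]]
theta = maxent_irl(DEMOS)
policy = soft_policy(theta)
```

Here `theta` plays the role of the inferred reward and `policy` the inferred stochastic policy; in this toy setting the largest reward lands on the state where the demonstrations end. With real locomotion data the one-hot features would presumably be replaced by features of the measured state variables, which this sketch does not attempt.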




Source journal

Bioinspiration & Biomimetics (Engineering & Technology: Materials Science, Biomaterials)

CiteScore: 5.90

Self-citation rate: 14.70%

Articles published: 132

Review time: 3 months

Journal description: Bioinspiration & Biomimetics publishes research involving the study and distillation of principles and functions found in biological systems that have been developed through evolution, and application of this knowledge to produce novel and exciting basic technologies and new approaches to solving scientific problems. It provides a forum for interdisciplinary research which acts as a pipeline, facilitating the two-way flow of ideas and understanding between the extensive bodies of knowledge of the different disciplines. It has two principal aims: to draw on biology to enrich engineering and to draw from engineering to enrich biology. The journal aims to include input from across all intersecting areas of both fields. In biology, this would include work in all fields from physiology to ecology, with either zoological or botanical focus. In engineering, this would include both design and practical application of biomimetic or bioinspired devices and systems.

Typical areas of interest include: systems, designs and structure; communication and navigation; cooperative behaviour; self-organizing biological systems; self-healing and self-assembly; aerial locomotion and aerospace applications of biomimetics; biomorphic surface and subsurface systems; marine dynamics (swimming and underwater dynamics); applications of novel materials; biomechanics (including movement, locomotion, fluidics); cellular behaviour; sensors and senses; biomimetic or bioinformed approaches to geological exploration.