A Multistep Reinforcement Learning Control of Shear Flows in Minimal Input–Output Plants Under Large Time-delays

IF 2.4 · Zone 3, Engineering and Technology · Q3 MECHANICS

Flow, Turbulence and Combustion Pub Date : 2025-09-15 DOI:10.1007/s10494-025-00697-w

Amine Saibi, Lionel Mathelin, Onofrio Semeraro

Volume 115, pages 1379-1402. Journal article; not open access. Publisher page: https://link.springer.com/article/10.1007/s10494-025-00697-w

Citations: 0

Abstract

Flow control has attracted sustained research interest for its potential to reduce drag, suppress turbulence, and enhance mixing in fluid systems. The emergence of data-driven modeling and machine learning techniques has sparked new interest in designing control strategies that can adapt in real time to complex, high-dimensional flow environments. However, fluid systems remain particularly challenging testbeds for control design because of their nonlinear and convective nature, which introduces large time delays. In active control, additional difficulties arise from practical constraints, such as the limited number of localized sensors available. In this work, we investigate a reinforcement learning framework based on a suitable actor–critic algorithm designed to address these challenges. Two test cases representative of transitional shear flows are considered: a linearized version of the Kuramoto–Sivashinsky equation and the control of instabilities in a two-dimensional boundary-layer flow over a flat plate, using a minimal but realistic sensor–actuator configuration. This choice reflects our focus on the limitations that arise in plants of experimental interest. Time delays are identified during a pretraining stage, while the control algorithm employs multistep returns during value iteration. This approach improves both the convergence rate and the stability of learning. Furthermore, we show that the look-ahead in the multistep formulation provides a non-trivial beneficial effect in plants where the control task is characterized by a severe credit-assignment issue.
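To make the role of multistep returns concrete, the sketch below computes generic n-step bootstrapped targets for a critic. It is a textbook construction, not the authors' implementation: the function name `n_step_targets`, its arguments, and the toy trajectory are assumptions chosen for illustration. The reward is placed several steps after the start of the trajectory to mimic the delayed effect of an actuation in a convective flow.

```python
from typing import List, Sequence


def n_step_targets(rewards: Sequence[float],
                   values: Sequence[float],
                   gamma: float,
                   n: int) -> List[float]:
    """Return the n-step bootstrapped target G_t for t = 0..T-1.

    rewards[t] is the reward received after acting at step t (length T).
    values[t] is the critic estimate V(s_t) for t = 0..T (length T + 1);
    use values[T] = 0.0 if the trajectory ends in a terminal state.
    Hypothetical helper, written for illustration only.
    """
    T = len(rewards)
    if len(values) != T + 1:
        raise ValueError("values must have exactly one more entry than rewards")
    targets = []
    for t in range(T):
        # Truncate the look-ahead near the end of the trajectory.
        horizon = min(n, T - t)
        g = 0.0
        for k in range(horizon):
            g += (gamma ** k) * rewards[t + k]
        # Bootstrap from the critic at the end of the look-ahead window.
        g += (gamma ** horizon) * values[t + horizon]
        targets.append(g)
    return targets


# Toy trajectory: the only reward arrives three steps after the first action.
rewards = [0.0, 0.0, 0.0, 1.0]
values = [0.0, 0.0, 0.0, 0.0, 0.0]  # untrained critic, terminal value 0

one_step = n_step_targets(rewards, values, gamma=0.5, n=1)
four_step = n_step_targets(rewards, values, gamma=0.5, n=4)
print(one_step)   # [0.0, 0.0, 0.0, 1.0]
print(four_step)  # [0.125, 0.25, 0.5, 1.0]
```

With a one-step target and an untrained critic, only the last state receives any learning signal, so credit has to propagate backward one update at a time. With a four-step look-ahead, every earlier state is credited in a single pass, which is the kind of benefit the abstract attributes to the multistep formulation under large delays.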


Source journal

Flow, Turbulence and Combustion (Engineering and Technology, Mechanics)

CiteScore

5.70

Self-citation rate

8.30%

Articles published

Review time

2 months

Journal description: Flow, Turbulence and Combustion provides a global forum for the publication of original and innovative research results that contribute to the solution of fundamental and applied problems encountered in single-phase, multi-phase and reacting flows, in both idealized and real systems. The scope of coverage encompasses topics in fluid dynamics, scalar transport, multi-physics interactions and flow control. From time to time the journal publishes Special or Theme Issues featuring invited articles. Contributions may report research that falls within the broad spectrum of analytical, computational and experimental methods. This includes research conducted in academia, industry and a variety of environmental and geophysical sectors. Turbulence, transition and associated phenomena are expected to play a significant role in the majority of studies reported, although non-turbulent flows, typical of those in micro-devices, would be regarded as falling within the scope covered. The emphasis is on originality, timeliness, quality and thematic fit, as exemplified by the title of the journal and the qualifications described above. Relevance to real-world problems and industrial applications is regarded as a strength.