Reinforcement learning for automated method development in liquid chromatography: insights in the reward scheme and experimental budget selection

IF 4.0 | CAS Zone 2 (Chemistry) | JCR Q1 (Biochemical Research Methods)
Journal of Chromatography A | Pub Date: 2025-05-10 | Epub Date: 2025-03-06 | DOI: 10.1016/j.chroma.2025.465845
Leon E. Niezen, Pieter J.K. Libin, Deirdre Cabooter, Gert Desmet
{"title":"液相色谱中自动化方法开发的强化学习:奖励方案和实验预算选择的见解","authors":"Leon E. Niezen ,&nbsp;Pieter J.K. Libin ,&nbsp;Deirdre Cabooter ,&nbsp;Gert Desmet","doi":"10.1016/j.chroma.2025.465845","DOIUrl":null,"url":null,"abstract":"<div><div>Chromatographic problem solving, commonly referred to as method development (MD), is hugely complex, given the many operational parameters that must be optimized and their large effect on the elution times of individual sample compounds. Recently, the use of reinforcement learning has been proposed to automate and expedite this process for liquid chromatography (LC). This study further explores deep reinforcement learning (RL) for LC method development. Given the large training budgets required, an <em>in-silico</em> approach was taken to train several Proximal Policy Optimization (PPO) agents. High-performing PPO agents were trained using sparse rewards (=rewarding only when all sample components were fully separated) and large experimental budgets. Strategies like frame stacking or long short-term memory networks were also investigated to improve the agents further. The trained agents were benchmarked against a Bayesian Optimization (BO) algorithm using a set of 1000 randomly-composed samples. Both algorithms were tasked with finding gradient programs that fully resolved all compounds in the samples, using a minimal number of experiments. When the number of parameters to tune was limited (single-segment gradient programs) PPO required on average, 1 to 2 fewer experiments, but did not outperform BO with respect to the number of solutions found, with PPO and BO solving 17% and 19% of the most challenging samples, respectively. However, PPO excelled at more complex tasks involving a higher number of parameters. As an example, when optimizing a five-segment gradient PPO solved 31% of samples, while BO solved 24% of samples.</div></div>","PeriodicalId":347,"journal":{"name":"Journal of Chromatography A","volume":"1748 ","pages":"Article 465845"},"PeriodicalIF":4.0000,"publicationDate":"2025-05-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Reinforcement learning for automated method development in liquid chromatography: insights in the reward scheme and experimental budget selection\",\"authors\":\"Leon E. Niezen ,&nbsp;Pieter J.K. Libin ,&nbsp;Deirdre Cabooter ,&nbsp;Gert Desmet\",\"doi\":\"10.1016/j.chroma.2025.465845\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Chromatographic problem solving, commonly referred to as method development (MD), is hugely complex, given the many operational parameters that must be optimized and their large effect on the elution times of individual sample compounds. Recently, the use of reinforcement learning has been proposed to automate and expedite this process for liquid chromatography (LC). This study further explores deep reinforcement learning (RL) for LC method development. Given the large training budgets required, an <em>in-silico</em> approach was taken to train several Proximal Policy Optimization (PPO) agents. High-performing PPO agents were trained using sparse rewards (=rewarding only when all sample components were fully separated) and large experimental budgets. Strategies like frame stacking or long short-term memory networks were also investigated to improve the agents further. The trained agents were benchmarked against a Bayesian Optimization (BO) algorithm using a set of 1000 randomly-composed samples. 
Both algorithms were tasked with finding gradient programs that fully resolved all compounds in the samples, using a minimal number of experiments. When the number of parameters to tune was limited (single-segment gradient programs) PPO required on average, 1 to 2 fewer experiments, but did not outperform BO with respect to the number of solutions found, with PPO and BO solving 17% and 19% of the most challenging samples, respectively. However, PPO excelled at more complex tasks involving a higher number of parameters. As an example, when optimizing a five-segment gradient PPO solved 31% of samples, while BO solved 24% of samples.</div></div>\",\"PeriodicalId\":347,\"journal\":{\"name\":\"Journal of Chromatography A\",\"volume\":\"1748 \",\"pages\":\"Article 465845\"},\"PeriodicalIF\":4.0000,\"publicationDate\":\"2025-05-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Chromatography A\",\"FirstCategoryId\":\"1\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0021967325001931\",\"RegionNum\":2,\"RegionCategory\":\"化学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2025/3/6 0:00:00\",\"PubModel\":\"Epub\",\"JCR\":\"Q1\",\"JCRName\":\"BIOCHEMICAL RESEARCH METHODS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Chromatography A","FirstCategoryId":"1","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0021967325001931","RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2025/3/6 0:00:00","PubModel":"Epub","JCR":"Q1","JCRName":"BIOCHEMICAL RESEARCH METHODS","Score":null,"Total":0}
Citations: 0

Abstract

Chromatographic problem solving, commonly referred to as method development (MD), is hugely complex, given the many operational parameters that must be optimized and their large effect on the elution times of individual sample compounds. Recently, the use of reinforcement learning has been proposed to automate and expedite this process for liquid chromatography (LC). This study further explores deep reinforcement learning (RL) for LC method development. Given the large training budgets required, an in-silico approach was taken to train several Proximal Policy Optimization (PPO) agents. High-performing PPO agents were trained using sparse rewards (i.e., rewarding only when all sample components were fully separated) and large experimental budgets. Strategies such as frame stacking and long short-term memory networks were also investigated to improve the agents further. The trained agents were benchmarked against a Bayesian Optimization (BO) algorithm using a set of 1000 randomly composed samples. Both algorithms were tasked with finding gradient programs that fully resolved all compounds in the samples, using a minimal number of experiments. When the number of parameters to tune was limited (single-segment gradient programs), PPO required, on average, 1 to 2 fewer experiments, but did not outperform BO with respect to the number of solutions found, with PPO and BO solving 17% and 19% of the most challenging samples, respectively. However, PPO excelled at more complex tasks involving a higher number of parameters. For example, when optimizing a five-segment gradient, PPO solved 31% of samples, while BO solved 24%.
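
The two design choices the abstract highlights, the sparse reward scheme and the fixed experimental budget, can be made concrete with a short sketch. The code below is a minimal illustration, not the authors' implementation: it assumes the common baseline-resolution criterion Rs ≥ 1.5, represents a gradient program as a plain list of setpoints, and all names (Peak, resolution, sparse_reward, run_episode) are hypothetical.

```python
# Illustrative sketch (hypothetical names, not the paper's code): a sparse
# reward that pays out only when every adjacent peak pair is baseline-
# resolved, plus an episode loop that stops at a fixed experimental budget.

from dataclasses import dataclass
from typing import Callable, Sequence

RS_TARGET = 1.5  # common baseline-resolution criterion (assumption)


@dataclass
class Peak:
    retention_time: float  # elution time of the compound (min)
    width: float           # peak width at base (min)


def resolution(a: Peak, b: Peak) -> float:
    """Classical resolution: Rs = 2 * (t2 - t1) / (w1 + w2)."""
    return 2.0 * abs(b.retention_time - a.retention_time) / (a.width + b.width)


def sparse_reward(peaks: Sequence[Peak]) -> float:
    """Reward 1.0 only when all sample components are fully separated."""
    ordered = sorted(peaks, key=lambda p: p.retention_time)
    fully_separated = all(
        resolution(p, q) >= RS_TARGET for p, q in zip(ordered, ordered[1:])
    )
    return 1.0 if fully_separated else 0.0


def run_episode(
    propose_gradient: Callable[[], list[float]],          # agent policy (stub)
    run_experiment: Callable[[list[float]], list[Peak]],  # in-silico LC simulator (stub)
    budget: int,
) -> int:
    """Propose gradient programs until full separation or budget exhaustion.

    Returns the number of experiments used; budget + 1 signals failure.
    """
    for n in range(1, budget + 1):
        peaks = run_experiment(propose_gradient())
        if sparse_reward(peaks) == 1.0:
            return n
    return budget + 1
```

Under such a scheme the reward signal is zero for almost every intermediate chromatogram, which is consistent with the abstract's observation that large experimental budgets during training were needed to obtain high-performing PPO agents.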
Source journal
Journal of Chromatography A (Chemistry: Analytical Chemistry)
CiteScore: 7.90
Self-citation rate: 14.60%
Publication volume: 742
Review time: 45 days
Journal description: The Journal of Chromatography A provides a forum for the publication of original research and critical reviews on all aspects of fundamental and applied separation science. The scope of the journal includes chromatography and related techniques, electromigration techniques (e.g. electrophoresis, electrochromatography), hyphenated and other multi-dimensional techniques, sample preparation, and detection methods such as mass spectrometry. Contributions consist mainly of research papers dealing with the theory of separation methods, instrumental developments and analytical and preparative applications of general interest.