Reinforcement learning for automated method development in liquid chromatography: insights in the reward scheme and experimental budget selection
Leon E. Niezen, Pieter J.K. Libin, Deirdre Cabooter, Gert Desmet
Journal of Chromatography A, Volume 1748, Article 465845 (published 2025-03-06). DOI: 10.1016/j.chroma.2025.465845. Available at: https://www.sciencedirect.com/science/article/pii/S0021967325001931
Citations: 0
Abstract
Chromatographic problem solving, commonly referred to as method development (MD), is hugely complex, given the many operational parameters that must be optimized and their large effect on the elution times of individual sample compounds. Recently, the use of reinforcement learning has been proposed to automate and expedite this process for liquid chromatography (LC). This study further explores deep reinforcement learning (RL) for LC method development. Given the large training budgets required, an in-silico approach was taken to train several Proximal Policy Optimization (PPO) agents. High-performing PPO agents were trained using sparse rewards (i.e., rewarding the agent only when all sample components were fully separated) and large experimental budgets. Strategies such as frame stacking and long short-term memory networks were also investigated to improve the agents further. The trained agents were benchmarked against a Bayesian Optimization (BO) algorithm on a set of 1000 randomly composed samples. Both algorithms were tasked with finding gradient programs that fully resolved all compounds in the samples using a minimal number of experiments. When the number of parameters to tune was limited (single-segment gradient programs), PPO required, on average, 1 to 2 fewer experiments, but did not outperform BO with respect to the number of solutions found, with PPO and BO solving 17% and 19% of the most challenging samples, respectively. However, PPO excelled at more complex tasks involving a higher number of parameters: for example, when optimizing a five-segment gradient, PPO solved 31% of samples, while BO solved 24%.
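To make the sparse reward scheme described in the abstract more concrete, the sketch below shows one way such a reward could be computed from a simulated chromatogram: the agent is rewarded only when every adjacent peak pair reaches baseline resolution, and otherwise incurs a small per-experiment cost to encourage short experimental budgets. This is a minimal illustration, not the authors' implementation; the resolution threshold (Rs >= 1.5), the step penalty, and all function names are assumptions introduced here.

```python
# Illustrative sketch of a sparse reward for in-silico LC method development.
# Assumptions (not taken from the paper): retention times and peak widths come
# from a chromatographic simulator; Rs >= 1.5 is treated as "fully resolved";
# every additional experiment costs a small negative reward.

import numpy as np

RS_TARGET = 1.5        # assumed baseline-resolution threshold
STEP_PENALTY = -0.05   # assumed per-experiment cost, favoring small budgets


def min_resolution(t_r: np.ndarray, w: np.ndarray) -> float:
    """Smallest resolution Rs between adjacent peaks.

    t_r : retention times of all sample compounds
    w   : peak widths at the base (4-sigma)
    """
    order = np.argsort(t_r)
    t_r, w = t_r[order], w[order]
    rs = 2.0 * np.diff(t_r) / (w[:-1] + w[1:])  # Rs = 2(t2 - t1) / (w1 + w2)
    return float(rs.min())


def sparse_reward(t_r: np.ndarray, w: np.ndarray, budget_exhausted: bool):
    """Return (reward, episode_done) for one simulated gradient experiment."""
    if min_resolution(t_r, w) >= RS_TARGET:
        return 1.0, True                  # reward only when *all* compounds are separated
    return STEP_PENALTY, budget_exhausted  # otherwise a small cost per experiment


# Example: three peaks, the last pair not baseline separated
t_r = np.array([2.0, 5.0, 5.2])
w = np.array([0.30, 0.32, 0.33])
print(sparse_reward(t_r, w, budget_exhausted=False))  # (-0.05, False)
```

Under this kind of scheme the episode ends either when full separation is reached or when the experimental budget runs out, which is consistent with the paper's focus on how the reward definition and budget size shape agent performance.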
Journal description
The Journal of Chromatography A provides a forum for the publication of original research and critical reviews on all aspects of fundamental and applied separation science. The scope of the journal includes chromatography and related techniques, electromigration techniques (e.g. electrophoresis, electrochromatography), hyphenated and other multi-dimensional techniques, sample preparation, and detection methods such as mass spectrometry. Contributions consist mainly of research papers dealing with the theory of separation methods, instrumental developments and analytical and preparative applications of general interest.