Gradient-Based Framework for Bilevel Optimization of Black-Box Functions: Synergizing Model-Free Reinforcement Learning and Implicit Function Differentiation
{"title":"Gradient-Based Framework for Bilevel Optimization of Black-Box Functions: Synergizing Model-Free Reinforcement Learning and Implicit Function Differentiation","authors":"Thomas Banker, and , Ali Mesbah*, ","doi":"10.1021/acs.iecr.4c0358410.1021/acs.iecr.4c03584","DOIUrl":null,"url":null,"abstract":"<p >Bilevel optimization problems are challenging to solve due to the complex interplay between upper-level and lower-level decision variables. Classical solution methods generally simplify the bilevel problem to a single level problem, whereas more recent methods such as evolutionary algorithms and Bayesian optimization take a black-box view that can suffer from scalability to larger problems. While advantageous for handling high-dimensional and nonconvex optimization problems, the application of gradient-based solution methods to bilevel problems is impeded by the implicit relationship between the upper-level and lower-level decision variables. Additionally, lack of an equation-oriented relationship between decision variables and the upper-level objective can further impede differentiability. To this end, we present a gradient-based optimization framework that leverages implicit function theorem and model-free reinforcement learning (RL) to solve bilevel optimization problems wherein only zeroth-order observations of the upper-level objective are available. Implicit differentiation allows for differentiating the optimality conditions of the lower-level problem to enable calculation of gradients of the upper-level objective. Using policy gradient RL, gradient-based updates of the upper-level decisions can then be performed in a scalable manner for high-dimension problems. The proposed framework is applied to the bilevel problem of learning optimization-based control policies for uncertain systems. Simulation results on two benchmark problems illustrate the effectiveness of the framework for goal-oriented learning of model predictive control policies. Synergizing derivative-free optimization via model-free RL and gradient calculation via implicit function differentiation can create new avenues for scalable and efficient solution of bilevel problems with black-box upper-level objective as compared to black-box optimization methods that discard the problem structure.</p>","PeriodicalId":39,"journal":{"name":"Industrial & Engineering Chemistry Research","volume":"64 5","pages":"2831–2844 2831–2844"},"PeriodicalIF":3.9000,"publicationDate":"2025-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://pubs.acs.org/doi/epdf/10.1021/acs.iecr.4c03584","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Industrial & Engineering Chemistry Research","FirstCategoryId":"5","ListUrlMain":"https://pubs.acs.org/doi/10.1021/acs.iecr.4c03584","RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ENGINEERING, CHEMICAL","Score":null,"Total":0}
Citations: 0
Abstract
Bilevel optimization problems are challenging to solve due to the complex interplay between upper-level and lower-level decision variables. Classical solution methods generally simplify the bilevel problem to a single-level problem, whereas more recent methods such as evolutionary algorithms and Bayesian optimization take a black-box view that can scale poorly to larger problems. Although gradient-based solution methods are advantageous for handling high-dimensional and nonconvex optimization problems, their application to bilevel problems is impeded by the implicit relationship between the upper-level and lower-level decision variables. Additionally, the lack of an equation-oriented relationship between the decision variables and the upper-level objective can further impede differentiability. To this end, we present a gradient-based optimization framework that leverages the implicit function theorem and model-free reinforcement learning (RL) to solve bilevel optimization problems wherein only zeroth-order observations of the upper-level objective are available. Implicit differentiation of the optimality conditions of the lower-level problem enables calculation of gradients of the upper-level objective. Using policy gradient RL, gradient-based updates of the upper-level decisions can then be performed in a scalable manner for high-dimensional problems. The proposed framework is applied to the bilevel problem of learning optimization-based control policies for uncertain systems. Simulation results on two benchmark problems illustrate the effectiveness of the framework for goal-oriented learning of model predictive control policies. Synergizing derivative-free optimization via model-free RL with gradient calculation via implicit function differentiation can create new avenues for the scalable and efficient solution of bilevel problems with black-box upper-level objectives, as compared to black-box optimization methods that discard the problem structure.
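As a concrete illustration of the mechanism described above, the following is a minimal sketch of how implicit differentiation recovers upper-level gradients in a generic bilevel problem. The notation ($F$ for the upper-level objective, $f$ for the lower-level objective, $\theta$ and $x$ for the upper- and lower-level decision variables, $\pi_{\phi}$ for a stochastic policy) is illustrative rather than drawn from the paper itself, and a smooth, unconstrained lower level is assumed; constrained lower-level problems would instead differentiate their KKT conditions. Writing the bilevel problem as

$$
\min_{\theta} \; F\big(\theta, x^{*}(\theta)\big)
\quad \text{s.t.} \quad
x^{*}(\theta) \in \arg\min_{x} f(\theta, x),
$$

the lower-level optimum satisfies the stationarity condition $\nabla_{x} f(\theta, x^{*}(\theta)) = 0$. Differentiating this condition with respect to $\theta$ and applying the implicit function theorem yields the sensitivity of the lower-level solution,

$$
\frac{\mathrm{d} x^{*}}{\mathrm{d} \theta}
= -\big[\nabla^{2}_{xx} f\big]^{-1} \nabla^{2}_{x\theta} f,
$$

so the total derivative of the upper-level objective follows from the chain rule:

$$
\frac{\mathrm{d} F}{\mathrm{d} \theta}
= \nabla_{\theta} F
+ \Big(\frac{\mathrm{d} x^{*}}{\mathrm{d} \theta}\Big)^{\!\top} \nabla_{x} F.
$$

When $F$ is available only through zeroth-order observations, a policy-gradient (score-function) estimator of the kind the abstract invokes replaces the unavailable derivative of $F$ with a sampled estimate under a stochastic policy $\pi_{\phi}$ over the upper-level decisions:

$$
\nabla_{\phi} \, \mathbb{E}_{\theta \sim \pi_{\phi}}\big[F(\theta)\big]
= \mathbb{E}_{\theta \sim \pi_{\phi}}\big[F(\theta) \, \nabla_{\phi} \log \pi_{\phi}(\theta)\big],
$$

which requires only function evaluations, making it compatible with black-box upper-level objectives while the implicit-gradient machinery handles the lower-level structure.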
Journal Introduction:
Industrial & Engineering Chemistry, with variations in title and format, has been published since 1909 by the American Chemical Society. Industrial & Engineering Chemistry Research is a weekly publication that reports industrial and academic research in the broad fields of applied chemistry and chemical engineering, with special focus on fundamentals, processes, and products.