Title: Interpretable Intersection Control by Reinforcement Learning Agent With Linear Function Approximator
Authors: Somporn Sahachaiseree, Takashi Oguchi
Journal: IET Intelligent Transport Systems, volume 19, issue 1 (Impact Factor 2.5; JCR Q2, Engineering, Electrical & Electronic)
Publication date: 2025-04-28
DOI: 10.1049/itr2.70034
URL: https://onlinelibrary.wiley.com/doi/10.1049/itr2.70034
Citation count: 0
Abstract
Reinforcement learning (RL) is a promising machine-learning solution to traffic signal control problems, which have been extensively studied. However, previous studies proposing RL-based controllers have predominantly employed variants of non-linear, deep artificial neural network (ANN) function approximators (FAs), whose black-box nature leaves a significant interpretability issue. In this work, the use of a linear FA for a value-based RL agent in traffic signal control problems is investigated along with the least-squares $Q$-learning method, abbreviated as ${\rm LSTD}Q$. The interpretable linear FA was found to be adequate for the RL agent to learn an optimal policy. This leads to the proposal to replace the non-linear ANN FA with its linear counterpart, resolving the interpretability issue. Moreover, the ${\rm LSTD}Q$ learning method shows superior behaviour convergence compared to a gradient descent method. In a low-intensity arrival pattern scenario, control by the RL agent cuts the average delay by about half relative to pretimed control. Owing to the conciseness of the linear FA, a direct interpretation analysis of the converged linear-FA parameters is presented. Lastly, two online relearning tests of the agents under non-stationary arrivals are conducted to demonstrate the online performance of ${\rm LSTD}Q$. In conclusion, the linear-FA specification and the ${\rm LSTD}Q$ method are together proposed for their control-algorithm interpretability, superior convergence quality, and lack of hyperparameters.
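As a rough illustration of the least-squares idea named in the abstract, the sketch below evaluates a fixed policy with a linear FA by accumulating the textbook LSTD-Q system A w = b, where A sums phi(s, a) (phi(s, a) - gamma * phi(s', pi(s')))^T and b sums phi(s, a) * r over sampled transitions. It is not the authors' implementation: the two-state toy problem, the one-hot features, the discount factor, and every function name here are assumptions made for the example, and it assumes A is non-singular.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def lstdq(samples, phi, policy, gamma, k):
    """Return weights w such that Q(s, a) is approximated by w . phi(s, a).

    samples: iterable of (s, a, r, s_next) transitions
    phi:     feature function returning a list of k floats
    policy:  the fixed policy being evaluated, mapping state -> action
    """
    A = [[0.0] * k for _ in range(k)]
    b = [0.0] * k
    for s, a, r, s_next in samples:
        f = phi(s, a)
        f_next = phi(s_next, policy(s_next))
        for i in range(k):
            b[i] += f[i] * r
            for j in range(k):
                A[i][j] += f[i] * (f[j] - gamma * f_next[j])
    return solve(A, b)


# Hypothetical toy problem: 2 states, 2 actions, taking action a moves to state a,
# and the reward is 1 for action 1 and 0 for action 0.
def phi(s, a):
    features = [0.0] * 4
    features[2 * s + a] = 1.0  # one-hot over (state, action) pairs
    return features


def always_one(s):
    return 1  # policy under evaluation: always choose action 1


samples = [(s, a, float(a), a) for s in (0, 1) for a in (0, 1)]
weights = lstdq(samples, phi, always_one, gamma=0.5, k=4)
print(weights)  # one weight per (state, action) pair
```

With one-hot features each weight is simply the estimated value of one state-action pair, which is the sense in which a linear FA can be read off directly. The solve is a single linear system with no step size to tune, in contrast to a gradient descent update.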
About the journal:
IET Intelligent Transport Systems is an interdisciplinary journal devoted to research into the practical applications of ITS and infrastructures. The scope of the journal includes the following:
Sustainable traffic solutions
Deployments with enabling technologies
Pervasive monitoring
Applications; demonstrations and evaluation
Economic and behavioural analyses of ITS services and scenarios
Data Integration and analytics
Information collection and processing; image processing applications in ITS
ITS aspects of electric vehicles
Autonomous vehicles; connected vehicle systems
In-vehicle ITS, safety and vulnerable road user aspects
Mobility as a service systems
Traffic management and control
Public transport systems technologies
Fleet and public transport logistics
Emergency and incident management
Demand management and electronic payment systems
Traffic related air pollution management
Policy and institutional issues
Interoperability, standards and architectures
Funding scenarios
Enforcement
Human machine interaction
Education, training and outreach
Current Special Issue Call for papers:
Intelligent Transportation Systems in Smart Cities for Sustainable Environment - https://digital-library.theiet.org/files/IET_ITS_CFP_ITSSCSE.pdf
Sustainably Intelligent Mobility (SIM) - https://digital-library.theiet.org/files/IET_ITS_CFP_SIM.pdf
Traffic Theory and Modelling in the Era of Artificial Intelligence and Big Data (in collaboration with World Congress for Transport Research, WCTR 2019) - https://digital-library.theiet.org/files/IET_ITS_CFP_WCTR.pdf