Reinforcement Learning for Guiding the E Theorem Prover
Authors: Jack McKeown, G. Sutcliffe
Published in: The International FLAIRS Conference Proceedings, 2023-05-08
DOI: 10.32473/flairs.36.133334 (https://doi.org/10.32473/flairs.36.133334)
Abstract
Automated Theorem Proving (ATP) systems search for a proof in a rapidly growing space of possibilities. Heuristics have a profound impact on search, and ATP systems make heavy use of heuristics. This work uses reinforcement learning to learn a metaheuristic that decides which heuristic to use at each step of a proof search in the E ATP system. Proximal policy optimization is used to dynamically select a heuristic from a fixed set, based on the current state of E. The approach is evaluated on its ability to reduce the number of inference steps used in successful proof searches, as an indicator of intelligent search.
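
To make the idea in the abstract concrete, the following is a minimal sketch of two ingredients it names: a stochastic policy that picks one heuristic from a fixed set given a state vector, and the standard clipped surrogate term from proximal policy optimization. It is not the authors' implementation and does not interface with E. The heuristic labels, the two-element state vector, the linear policy, and all function names are assumptions made purely for illustration; the abstract does not specify the actual heuristic set, state features, network, or reward.

```python
import math
import random

# Hypothetical labels: the abstract does not list the actual heuristics.
HEURISTICS = ["heuristic_a", "heuristic_b", "heuristic_c"]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def policy_probs(weights, state):
    """Assumed linear policy: one weight row per heuristic, dotted with the state."""
    logits = [sum(w * s for w, s in zip(row, state)) for row in weights]
    return softmax(logits)


def select_heuristic(weights, state, rng):
    """Sample a heuristic index from the policy; also return its probability."""
    probs = policy_probs(weights, state)
    idx = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return idx, probs[idx]


def ppo_clipped_term(new_prob, old_prob, advantage, eps=0.2):
    """PPO clipped surrogate for one action: min(r * A, clip(r, 1-eps, 1+eps) * A)."""
    ratio = new_prob / old_prob
    clipped = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped * advantage)


if __name__ == "__main__":
    rng = random.Random(0)
    weights = [[0.0, 0.0] for _ in HEURISTICS]  # untrained policy: uniform choice
    state = [0.5, 1.0]  # hypothetical state features, not E's real state
    idx, prob = select_heuristic(weights, state, rng)
    print(HEURISTICS[idx], prob)
```

In a full training loop, the selection step would run once per proof-search step, and the clipped term would be averaged over collected steps and maximized with respect to the policy weights. The clipping keeps each update from moving the new policy too far from the one that gathered the data.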