Deep Reinforcement Learning for Synthesizing Functions in Higher-Order Logic

Logic Programming and Automated Reasoning. Pub Date: 2019-10-25. DOI: 10.29007/7jmg

Thibault Gauthier

Citations: 11

Abstract

The paper describes a deep reinforcement learning framework based on self-supervised learning within the proof assistant HOL4. A close interaction between the machine learning modules and the HOL4 library is achieved by choosing tree neural networks (TNNs) as the machine learning models and by using HOL4 terms internally to represent the tree structures of TNNs. Recursive improvement is possible when a task is expressed as a search problem. In this case, a Monte Carlo Tree Search (MCTS) algorithm guided by a TNN can be used to explore the search space and produce better examples for training the next TNN. As an illustration, term synthesis tasks on combinators and Diophantine equations are specified and learned. We achieve a success rate of 65% on combinator synthesis problems, outperforming state-of-the-art automated theorem provers (ATPs) run with their best general set of strategies. We set a precedent for statistically guided synthesis of Diophantine equations by solving 78.5% of the generated test problems.
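
To make the idea of a tree neural network over terms concrete, here is a minimal sketch in Python. It is not the HOL4 implementation (which lives inside the proof assistant) and does not reproduce the paper's architecture or training procedure. All names (`TreeNN`, `make_layer`, `apply_layer`, `DIM`), the nested-tuple term encoding, the layer sizes, and the untrained random weights are assumptions made for illustration only. The sketch shows the general technique: one small network per operator combines the embeddings of its arguments, and a head maps the root embedding to a score that a search procedure such as MCTS could use as guidance.

```python
import math
import random

# Hypothetical encoding: a term is a nested tuple (operator, child_1, ..., child_n).
# A constant such as ("S",) has no children.

DIM = 4  # embedding size, chosen arbitrarily for this sketch


def make_layer(n_inputs, n_outputs, rng):
    """Return (weights, biases) for one dense layer with small random weights."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
               for _ in range(n_outputs)]
    biases = [0.0] * n_outputs
    return weights, biases


def apply_layer(layer, inputs):
    """Apply a dense layer followed by a tanh nonlinearity."""
    weights, biases = layer
    return [math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


class TreeNN:
    """Untrained tree network: one dense layer per operator plus a scalar head."""

    def __init__(self, arities, seed=0):
        rng = random.Random(seed)
        self.arities = dict(arities)
        # Constants receive a fixed input [1.0]; an operator of arity n
        # receives the concatenation of its n child embeddings.
        self.op_layers = {
            op: make_layer(n * DIM if n > 0 else 1, DIM, rng)
            for op, n in self.arities.items()
        }
        self.head = make_layer(DIM, 1, rng)

    def embed(self, term):
        """Compute the embedding of a term bottom-up, following its tree shape."""
        op, *children = term
        if len(children) != self.arities[op]:
            raise ValueError(f"operator {op!r} expects {self.arities[op]} arguments")
        if children:
            inputs = [x for child in children for x in self.embed(child)]
        else:
            inputs = [1.0]
        return apply_layer(self.op_layers[op], inputs)

    def value(self, term):
        """Map the root embedding to a score in (0, 1) with a sigmoid head."""
        weights, biases = self.head
        z = sum(w * x for w, x in zip(weights[0], self.embed(term))) + biases[0]
        return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    # Combinator-style term: the application of (S K) to K.
    net = TreeNN({"S": 0, "K": 0, "app": 2})
    term = ("app", ("app", ("S",), ("K",)), ("K",))
    print(net.value(term))
```

Because the network's shape mirrors the term's own tree structure, no separate encoding of terms into fixed-length vectors is needed, which is one plausible reading of why terms are a convenient internal representation for TNNs.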

