Fast Analysis of the OpenAI O1-Preview Model in Solving Random K-SAT Problem: Does the LLM Solve the Problem Itself or Call an External SAT Solver?

arXiv - PHYS - Disordered Systems and Neural Networks | Pub Date: 2024-09-17 | DOI: arxiv-2409.11232

Raffaele Marino


Abstract

In this manuscript I present an analysis of the performance of the OpenAI O1-preview model in solving random K-SAT instances for $K \in \{2,3,4\}$ as a function of $\alpha = M/N$, where $M$ is the number of clauses and $N$ is the number of variables of the satisfiable problem. I show that the model can call an external SAT solver to solve the instances, rather than solving them directly. Despite using external solvers, the model reports incorrect assignments as output. Moreover, I propose and present an analysis to quantify whether the OpenAI O1-preview model demonstrates a spark of intelligence or merely makes random guesses when outputting an assignment for a Boolean satisfiability problem.
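To make the setup concrete, the following is a minimal Python sketch of a random K-SAT instance at clause density $\alpha = M/N$ and of a check on a reported assignment. It is an illustration under stated assumptions, not the procedure used in the manuscript: the function names (`random_ksat`, `satisfied_fraction`, `is_satisfying`), the signed-integer literal encoding, and the choice $M = \mathrm{round}(\alpha N)$ are all hypothetical, and the generator does not guarantee that the instance is satisfiable. As a point of comparison, a uniformly random assignment satisfies each K-literal clause over distinct variables with probability $1 - 2^{-K}$, which gives a natural baseline for "random guessing".

```python
import random


def random_ksat(n_vars, alpha, k, seed=0):
    """Sample a random K-SAT instance (illustrative; satisfiability is NOT guaranteed).

    Each clause is a tuple of k signed integers: +v means variable v,
    -v means its negation. Variables are numbered 1..n_vars.
    """
    rng = random.Random(seed)
    n_clauses = round(alpha * n_vars)  # assumption: M = round(alpha * N)
    clauses = []
    for _ in range(n_clauses):
        variables = rng.sample(range(1, n_vars + 1), k)  # k distinct variables
        clauses.append(tuple(v if rng.random() < 0.5 else -v for v in variables))
    return clauses


def clause_satisfied(clause, assignment):
    """A clause holds if at least one literal agrees with the assignment."""
    return any(assignment[abs(lit)] == (lit > 0) for lit in clause)


def satisfied_fraction(clauses, assignment):
    """Fraction of clauses satisfied; assignment maps variable index -> bool."""
    return sum(clause_satisfied(c, assignment) for c in clauses) / len(clauses)


def is_satisfying(clauses, assignment):
    """True only if every clause is satisfied."""
    return all(clause_satisfied(c, assignment) for c in clauses)


if __name__ == "__main__":
    n_vars, alpha, k = 50, 3.0, 3
    clauses = random_ksat(n_vars, alpha, k, seed=1)
    guess = {v: random.random() < 0.5 for v in range(1, n_vars + 1)}
    print("clauses:", len(clauses))
    print("fraction satisfied by a random guess:", satisfied_fraction(clauses, guess))
    print("expected random-guess baseline:", 1 - 2 ** (-k))
```

A reported assignment can be passed to `is_satisfying` to verify it independently of how it was produced, and `satisfied_fraction` can be compared against the $1 - 2^{-K}$ baseline to judge whether an incorrect assignment is better than a random guess.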


