Fast Analysis of the OpenAI O1-Preview Model in Solving Random K-SAT Problem: Does the LLM Solve the Problem Itself or Call an External SAT Solver?
Raffaele Marino
arXiv - PHYS - Disordered Systems and Neural Networks, 2024-09-17
DOI: arxiv-2409.11232
Citations: 0
Abstract
In this manuscript, I analyze the performance of the OpenAI
O1-preview model in solving random K-SAT instances for $K \in \{2,3,4\}$ as a
function of $\alpha=M/N$, where $M$ is the number of clauses and $N$ is the
number of variables of the satisfiable problem. I show that the model can call
an external SAT solver to solve the instances, rather than solving them
directly. Despite using external solvers, the model reports incorrect
assignments as output. Moreover, I propose and present an analysis to quantify
whether the OpenAI O1-preview model demonstrates a spark of intelligence or
merely makes random guesses when outputting an assignment for a Boolean
satisfiability problem.
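
To make the setup concrete, the following is a minimal Python sketch of how a random K-SAT instance with clause density $\alpha=M/N$ might be generated and how a reported assignment could be checked. It is an illustration under stated assumptions, not the procedure used in the manuscript: the function names (`random_ksat`, `satisfies`, `random_guess_success_probability`) are hypothetical, the generator does not guarantee that the instance is satisfiable (the abstract considers satisfiable instances, which would require an additional filtering step), and the random-guess baseline $(1-2^{-K})^M$ is the standard probability that a uniformly random assignment satisfies $M$ independently drawn K-clauses, offered here only as one possible reference point.

```python
import random


def random_ksat(n_vars, alpha, k, seed=None):
    """Draw a random K-SAT formula with M = round(alpha * N) clauses.

    Each clause picks k distinct variables uniformly at random and negates
    each one with probability 1/2. Literals use the DIMACS convention:
    +v means variable v, -v means its negation. Satisfiability is NOT
    guaranteed by this generator (assumption of this sketch).
    """
    rng = random.Random(seed)
    n_clauses = round(alpha * n_vars)
    clauses = []
    for _ in range(n_clauses):
        variables = rng.sample(range(1, n_vars + 1), k)
        clause = tuple(v if rng.random() < 0.5 else -v for v in variables)
        clauses.append(clause)
    return clauses


def satisfies(clauses, assignment):
    """Return True if `assignment` (dict: variable -> bool) satisfies every clause."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )


def random_guess_success_probability(k, n_clauses):
    """Probability that a uniformly random assignment satisfies n_clauses
    independently drawn clauses of k distinct variables each."""
    return (1.0 - 2.0 ** (-k)) ** n_clauses


if __name__ == "__main__":
    n_vars, alpha, k = 20, 2.0, 3
    formula = random_ksat(n_vars, alpha, k, seed=0)
    guess = {v: random.random() < 0.5 for v in range(1, n_vars + 1)}
    print("clauses:", len(formula))
    print("random guess satisfies formula:", satisfies(formula, guess))
    print("baseline probability:", random_guess_success_probability(k, len(formula)))
```

Checking an assignment this way takes time linear in the formula size, so an assignment reported by a model can be verified independently of whichever solver (if any) produced it.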