RethinkMCTS: Refining Erroneous Thoughts in Monte Carlo Tree Search for Code Generation
Qingyao Li, Wei Xia, Kounianhua Du, Xinyi Dai, Ruiming Tang, Yasheng Wang, Yong Yu, Weinan Zhang
arXiv - CS - Software Engineering, 2024-09-15
DOI: arxiv-2409.09584
Citations: 0
Abstract
LLM agents enhanced by tree search algorithms have achieved notable
performance in code generation. However, current search algorithms in this
domain suffer from low search quality for three reasons: 1) a search space
poorly suited to the high reasoning demands of code generation tasks,
2) inadequate integration of code feedback with the search algorithm, and
3) poor handling of negative feedback during the search, leading to reduced
search efficiency and quality. To address these challenges, we propose to
search over the reasoning process behind the code and to use detailed feedback
from code execution to refine erroneous thoughts during the search. In this paper,
we introduce RethinkMCTS, which employs the Monte Carlo Tree Search (MCTS)
algorithm to conduct thought-level searches before generating code, thereby
exploring a wider range of strategies. More importantly, we construct verbal
feedback from fine-grained code execution results and use it to refine
erroneous thoughts during the search. This steers the search toward correct
reasoning paths and improves the overall search quality of the tree.
Through extensive experiments, we demonstrate
that RethinkMCTS outperforms previous search-based and feedback-based code
generation baselines. On the HumanEval dataset, it improves the pass@1 of
GPT-3.5-turbo from 70.12 to 89.02 and that of GPT-4o-mini from 87.20 to 94.51.
Thought-level search enables more thorough exploration, and the rethink
operation enhances the search quality of the entire tree.
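
The abstract describes the method only at a high level, so the following is a minimal sketch of the general idea, not the authors' implementation. It assumes a toy task (sum of squares of a list) in which each "thought" is backed by a real operation, so that execution feedback is genuine. The names `OPS`, `CANDIDATES`, `REFINE`, `propose_thoughts`, `execute`, and `refine_thought` are hypothetical stand-ins: in a real system, proposing and refining thoughts would be LLM calls, and `execute` would generate code from the thoughts and run it against tests.

```python
import math
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Toy task (assumption): compute the sum of squares of a list.
OPS = {
    "square each element": lambda xs: [x * x for x in xs],
    "double each element": lambda xs: [2 * x for x in xs],
    "take the maximum": max,
    "count the results": len,
    "sum the results": sum,
}
# Hypothetical candidate thoughts per depth; note that both depth-1 candidates are wrong.
CANDIDATES = {
    0: ["square each element", "double each element"],
    1: ["take the maximum", "count the results"],
}
# Stand-in for an LLM that rewrites an erroneous thought given verbal feedback.
REFINE = {"take the maximum": "sum the results", "count the results": "sum the results"}
TESTS = [([1, 2, 3], 14), ([3, 4], 25)]

@dataclass
class Node:
    thought: str
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)
    visits: int = 0
    value: float = 0.0

    def path(self) -> List[str]:
        """Thoughts from the root (exclusive) down to this node."""
        node, out = self, []
        while node.parent is not None:
            out.append(node.thought)
            node = node.parent
        return out[::-1]

def propose_thoughts(path: List[str]) -> List[str]:
    return list(CANDIDATES.get(len(path), []))

def execute(path: List[str]) -> Tuple[float, str, List[str]]:
    """Complete the path greedily, run the tests, return pass rate and verbal feedback."""
    full = list(path)
    while len(full) < len(CANDIDATES):
        full.append(propose_thoughts(full)[0])
    passed, feedback = 0, ""
    for xs, expected in TESTS:
        got = xs
        for thought in full:
            got = OPS[thought](got)
        if got == expected:
            passed += 1
        elif not feedback:
            feedback = f"input {xs}: expected {expected}, got {got}"
    return passed / len(TESTS), feedback, full

def refine_thought(thought: str, feedback: str) -> str:
    # A real system would pass `feedback` to an LLM; this toy version uses a lookup table.
    return REFINE.get(thought, thought)

def uct(child: Node, c: float = 1.4) -> float:
    if child.visits == 0:
        return math.inf
    return child.value / child.visits + c * math.sqrt(math.log(child.parent.visits) / child.visits)

def search(iterations: int = 10) -> Tuple[float, List[str]]:
    root = Node("<root>")
    best: Tuple[float, List[str]] = (-1.0, [])
    for _ in range(iterations):
        node = root
        while node.children:                      # selection
            node = max(node.children, key=uct)
        for t in propose_thoughts(node.path()):   # expansion
            node.children.append(Node(t, parent=node))
        if node.children:
            node = node.children[0]
        reward, feedback, full = execute(node.path())   # evaluation by execution
        if feedback:                              # rethink: refine the leaf's thought in place
            refined = refine_thought(node.thought, feedback)
            if refined != node.thought:
                node.thought = refined
                reward, feedback, full = execute(node.path())
        if reward > best[0]:
            best = (reward, full)
        if reward == 1.0:                         # all tests pass, so stop early
            return best
        while node is not None:                   # backpropagation
            node.visits += 1
            node.value += reward
            node = node.parent
    return best

if __name__ == "__main__":
    print(search())
```

In this toy setup no depth-1 candidate is correct, so plain search over the proposed thoughts can never pass the tests. The run succeeds only because the failing thought is rewritten in place using the execution feedback, which is the role the abstract assigns to the rethink operation. The sketch simplifies in two ways: it rethinks only the leaf's thought, and it replaces random rollouts with a greedy completion.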