Dehallucinating Large Language Models Using Formal Methods Guided Iterative Prompting

Susmit Jha, Sumit Kumar Jha, P. Lincoln, Nathaniel D. Bastian, Alvaro Velasquez, S. Neema
{"title":"Dehallucinating Large Language Models Using Formal Methods Guided Iterative Prompting","authors":"Susmit Jha, Sumit Kumar Jha, P. Lincoln, Nathaniel D. Bastian, Alvaro Velasquez, S. Neema","doi":"10.1109/ICAA58325.2023.00029","DOIUrl":null,"url":null,"abstract":"Large language models (LLMs) such as ChatGPT have been trained to generate human-like responses to natural language prompts. LLMs use a vast corpus of text data for training, and can generate coherent and contextually relevant responses to a wide range of questions and statements. Despite this remarkable progress, LLMs are prone to hallucinations making their application to safety-critical applications such as autonomous systems difficult. The hallucinations in LLMs refer to instances where the model generates responses that are not factually accurate or contextually appropriate. These hallucinations can occur due to a variety of factors, such as the model’s lack of real-world knowledge, the influence of biased or inaccurate training data, or the model’s tendency to generate responses based on statistical patterns rather than a true understanding of the input. While these hallucinations are a nuisance in tasks such as text summarization and question-answering, they can be catastrophic when LLMs are used in autonomy-relevant applications such as planning. In this paper, we focus on the application of LLMs in autonomous systems and sketch a novel self-monitoring and iterative prompting architecture that uses formal methods to detect these errors in the LLM response automatically. We exploit the dialog capability of LLMs to iteratively steer them to responses that are consistent with our correctness specification. We report preliminary experiments that show the promise of the proposed approach on tasks such as automated planning.","PeriodicalId":190198,"journal":{"name":"2023 IEEE International Conference on Assured Autonomy (ICAA)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 IEEE International Conference on Assured Autonomy (ICAA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICAA58325.2023.00029","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 8

Abstract

Large language models (LLMs) such as ChatGPT have been trained to generate human-like responses to natural language prompts. LLMs use a vast corpus of text data for training, and can generate coherent and contextually relevant responses to a wide range of questions and statements. Despite this remarkable progress, LLMs are prone to hallucinations, making their use in safety-critical applications such as autonomous systems difficult. Hallucinations in LLMs refer to instances where the model generates responses that are not factually accurate or contextually appropriate. These hallucinations can occur due to a variety of factors, such as the model's lack of real-world knowledge, the influence of biased or inaccurate training data, or the model's tendency to generate responses based on statistical patterns rather than a true understanding of the input. While these hallucinations are a nuisance in tasks such as text summarization and question answering, they can be catastrophic when LLMs are used in autonomy-relevant applications such as planning. In this paper, we focus on the application of LLMs in autonomous systems and sketch a novel self-monitoring and iterative prompting architecture that uses formal methods to automatically detect these errors in the LLM response. We exploit the dialog capability of LLMs to iteratively steer them to responses that are consistent with our correctness specification. We report preliminary experiments that show the promise of the proposed approach on tasks such as automated planning.
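
The abstract describes an iterative loop in which a formal-methods monitor checks each LLM response against a correctness specification and, on failure, feeds the diagnosis back through the model's dialog interface. The sketch below illustrates that loop under stated assumptions; it is not the authors' implementation, and `query_llm` and `check_response` are hypothetical placeholders for a dialog-capable LLM and a formal checker (e.g., a plan validator).

```python
# Minimal sketch of formal-methods-guided iterative prompting (assumptions:
# `query_llm` wraps a dialog-capable LLM, `check_response` is a formal checker
# that returns (ok, diagnosis) for a candidate response against a spec).
from typing import Callable, Optional, Tuple


def iterative_prompting(
    query_llm: Callable[[str], str],
    check_response: Callable[[str], Tuple[bool, str]],
    task: str,
    max_rounds: int = 5,
) -> Optional[str]:
    """Repeatedly prompt the LLM until its response satisfies the specification."""
    prompt = task
    for _ in range(max_rounds):
        response = query_llm(prompt)
        ok, diagnosis = check_response(response)
        if ok:
            return response  # response is consistent with the correctness specification
        # Steer the dialog: append the checker's diagnosis to the next prompt.
        prompt = (
            f"{task}\n\nYour previous answer was:\n{response}\n\n"
            f"It violates the correctness specification: {diagnosis}\n"
            "Please revise your answer so that it satisfies the specification."
        )
    return None  # no satisfying response found within max_rounds
```

In the planning setting the abstract mentions, `check_response` could plausibly be instantiated with a plan validator that checks the generated plan's actions against domain preconditions and goal conditions, with the violating step returned as the diagnosis fed back to the model.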