An Information-theoretic Approach to Prompt Engineering Without Ground Truth Labels

IF 4.7 | CAS Tier 2 (Sociology) | JCR Q1 POLITICAL SCIENCE
Lisa P. Argyle, E. Busby, Nancy Fulda, Joshua R Gubler, Christopher Rytting, Taylor Sorensen, D. Wingate
{"title":"An Information-theoretic Approach to Prompt Engineering Without Ground Truth Labels","authors":"Lisa P. Argyle, E. Busby, Nancy Fulda, Joshua R Gubler, Christopher Rytting, Taylor Sorensen, D. Wingate","doi":"10.1017/pan.2023.2","DOIUrl":null,"url":null,"abstract":"Pre-trained language models derive substantial linguistic and factual knowledge from the massive corpora on which they are trained, and prompt engineering seeks to align these models to specific tasks. Unfortunately, existing prompt engineering methods require significant amounts of labeled data, access to model parameters, or both. We introduce a new method for selecting prompt templates without labeled examples and without direct access to the model. Specifically, over a set of candidate templates, we choose the template that maximizes the mutual information between the input and the corresponding model output. Across 8 datasets representing 7 distinct NLP tasks, we show that when a template has high mutual information, it also has high accuracy on the task. On the largest model, selecting prompts with our method gets 90% of the way from the average prompt accuracy to the best prompt accuracy and requires no ground truth labels.","PeriodicalId":48270,"journal":{"name":"Political Analysis","volume":null,"pages":null},"PeriodicalIF":4.7000,"publicationDate":"2022-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"93","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Political Analysis","FirstCategoryId":"90","ListUrlMain":"https://doi.org/10.1017/pan.2023.2","RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"POLITICAL SCIENCE","Score":null,"Total":0}
Citations: 93

Abstract

Pre-trained language models derive substantial linguistic and factual knowledge from the massive corpora on which they are trained, and prompt engineering seeks to align these models to specific tasks. Unfortunately, existing prompt engineering methods require significant amounts of labeled data, access to model parameters, or both. We introduce a new method for selecting prompt templates without labeled examples and without direct access to the model. Specifically, over a set of candidate templates, we choose the template that maximizes the mutual information between the input and the corresponding model output. Across 8 datasets representing 7 distinct NLP tasks, we show that when a template has high mutual information, it also has high accuracy on the task. On the largest model, selecting prompts with our method gets 90% of the way from the average prompt accuracy to the best prompt accuracy and requires no ground truth labels.
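
As a rough illustration of the selection criterion described in the abstract (not the authors' released implementation), the sketch below scores each candidate template by an estimate of the mutual information I(X; Y) = H(Y) − H(Y|X) between inputs and model outputs, using unlabeled examples only. The helper `model_output_probs(template, x)` is a hypothetical stand-in for whatever API call returns the model's probability distribution over answer options for a filled-in template.

```python
import numpy as np

def entropy(p, eps=1e-12):
    """Shannon entropy (in nats) of a discrete distribution."""
    p = np.asarray(p, dtype=float)
    return -np.sum(p * np.log(p + eps))

def mutual_information(template, inputs, model_output_probs):
    """Estimate I(X; Y) for one template as H(Y) - H(Y | X).

    model_output_probs(template, x) is assumed to return the model's
    probability distribution over the answer options for input x.
    """
    per_input = np.stack([model_output_probs(template, x) for x in inputs])
    h_y = entropy(per_input.mean(axis=0))                    # marginal entropy H(Y)
    h_y_given_x = np.mean([entropy(p) for p in per_input])   # conditional entropy H(Y|X)
    return h_y - h_y_given_x

def select_template(templates, inputs, model_output_probs):
    """Return the candidate template with the highest estimated mutual information."""
    scores = {t: mutual_information(t, inputs, model_output_probs) for t in templates}
    return max(scores, key=scores.get), scores
```

No ground truth labels enter the computation: the score rewards templates whose outputs are confident for each individual input (low H(Y|X)) yet varied across inputs (high H(Y)), which is the label-free setting the paper targets.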
Source journal
Political Analysis
CiteScore: 8.80
Self-citation rate: 3.70%
Articles per year: 30
Journal description: Political Analysis chronicles these exciting developments by publishing the most sophisticated scholarship in the field. It is the place to learn new methods, to find some of the best empirical scholarship, and to publish your best research.