调节少量推理的可靠性-可用性权衡的一般供应-检查成本框架

IF 5 2区计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Complex & Intelligent Systems Pub Date : 2024-08-19 DOI:10.1007/s40747-024-01599-6

Fernando Martínez-Plumed, Gonzalo Jaimovitch-López, Cèsar Ferri, María José Ramírez-Quintana, José Hernández-Orallo

{"title":"调节少量推理的可靠性-可用性权衡的一般供应-检查成本框架","authors":"Fernando Martínez-Plumed, Gonzalo Jaimovitch-López, Cèsar Ferri, María José Ramírez-Quintana, José Hernández-Orallo","doi":"10.1007/s40747-024-01599-6","DOIUrl":null,"url":null,"abstract":"<p>Language models and other recent machine learning paradigms blur the distinction between generative and discriminative tasks, in a continuum that is regulated by the degree of pre- and post-supervision that is required from users, as well as the tolerated level of error. In few-shot inference, we need to find a trade-off between the number and cost of the solved examples that have to be supplied, those that have to be inspected (some of them accurate but others needing correction) and those that are wrong but pass undetected. In this paper, we define a new Supply-Inspect Cost Framework, associated graphical representations and comprehensive metrics that consider all these elements. To optimise few-shot inference under specific operating conditions, we introduce novel algorithms that go beyond the concept of rejection rules in both static and dynamic contexts. We illustrate the effectiveness of all these elements for a transformative domain, data wrangling, for which language models can have a huge impact if we are able to properly regulate the reliability-usability trade-off, as we do in this paper.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"12 1","pages":""},"PeriodicalIF":5.0000,"publicationDate":"2024-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A general supply-inspect cost framework to regulate the reliability-usability trade-offs for few-shot inference\",\"authors\":\"Fernando Martínez-Plumed, Gonzalo Jaimovitch-López, Cèsar Ferri, María José Ramírez-Quintana, José Hernández-Orallo\",\"doi\":\"10.1007/s40747-024-01599-6\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>Language models and other recent machine learning paradigms blur the distinction between generative and discriminative tasks, in a continuum that is regulated by the degree of pre- and post-supervision that is required from users, as well as the tolerated level of error. In few-shot inference, we need to find a trade-off between the number and cost of the solved examples that have to be supplied, those that have to be inspected (some of them accurate but others needing correction) and those that are wrong but pass undetected. In this paper, we define a new Supply-Inspect Cost Framework, associated graphical representations and comprehensive metrics that consider all these elements. To optimise few-shot inference under specific operating conditions, we introduce novel algorithms that go beyond the concept of rejection rules in both static and dynamic contexts. We illustrate the effectiveness of all these elements for a transformative domain, data wrangling, for which language models can have a huge impact if we are able to properly regulate the reliability-usability trade-off, as we do in this paper.</p>\",\"PeriodicalId\":10524,\"journal\":{\"name\":\"Complex & Intelligent Systems\",\"volume\":\"12 1\",\"pages\":\"\"},\"PeriodicalIF\":5.0000,\"publicationDate\":\"2024-08-19\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Complex & Intelligent Systems\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1007/s40747-024-01599-6\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Complex & Intelligent Systems","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s40747-024-01599-6","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

摘要

语言模型和其他最新的机器学习范式模糊了生成性任务和判别性任务之间的区别，它们是一个连续统一体，受用户所需的事前和事后监督程度以及可容忍的误差水平的制约。在少量推理中，我们需要在必须提供的求解示例的数量和成本、必须检查的示例（其中一些是准确的，但另一些需要纠正）和错误但未被发现的示例之间找到一个平衡点。在本文中，我们定义了一个新的 "供应-检查成本框架"、相关的图形表示和综合指标，以考虑所有这些要素。为了优化特定运行条件下的少量推断，我们引入了新的算法，超越了静态和动态情况下剔除规则的概念。我们说明了所有这些要素在数据处理这一变革性领域中的有效性，如果我们能够像本文所做的那样，适当调节可靠性与可用性之间的权衡，语言模型就能对该领域产生巨大的影响。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

A general supply-inspect cost framework to regulate the reliability-usability trade-offs for few-shot inference

查看原文本刊更多论文

A general supply-inspect cost framework to regulate the reliability-usability trade-offs for few-shot inference

Language models and other recent machine learning paradigms blur the distinction between generative and discriminative tasks, in a continuum that is regulated by the degree of pre- and post-supervision that is required from users, as well as the tolerated level of error. In few-shot inference, we need to find a trade-off between the number and cost of the solved examples that have to be supplied, those that have to be inspected (some of them accurate but others needing correction) and those that are wrong but pass undetected. In this paper, we define a new Supply-Inspect Cost Framework, associated graphical representations and comprehensive metrics that consider all these elements. To optimise few-shot inference under specific operating conditions, we introduce novel algorithms that go beyond the concept of rejection rules in both static and dynamic contexts. We illustrate the effectiveness of all these elements for a transformative domain, data wrangling, for which language models can have a huge impact if we are able to properly regulate the reliability-usability trade-off, as we do in this paper.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Complex & Intelligent Systems COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE-

CiteScore

9.60

自引率

10.30%

发文量

297

期刊介绍： Complex & Intelligent Systems aims to provide a forum for presenting and discussing novel approaches, tools and techniques meant for attaining a cross-fertilization between the broad fields of complex systems, computational simulation, and intelligent analytics and visualization. The transdisciplinary research that the journal focuses on will expand the boundaries of our understanding by investigating the principles and processes that underlie many of the most profound problems facing society today.