针对概率稳健可扩展机器学习分类的共形预测

IF 4.3 3区 计算机科学 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Alberto Carlevaro, Teodoro Alamo, Fabrizio Dabbene, Maurizio Mongelli
{"title":"针对概率稳健可扩展机器学习分类的共形预测","authors":"Alberto Carlevaro, Teodoro Alamo, Fabrizio Dabbene, Maurizio Mongelli","doi":"10.1007/s10994-024-06571-6","DOIUrl":null,"url":null,"abstract":"<p>Conformal predictions make it possible to define reliable and robust learning algorithms. But they are essentially a method for evaluating whether an algorithm is good enough to be used in practice. To define a reliable learning framework for classification from the very beginning of its design, the concept of scalable classifier was introduced to generalize the concept of classical classifier by linking it to statistical order theory and probabilistic learning theory. In this paper, we analyze the similarities between scalable classifiers and conformal predictions by introducing a new definition of a score function and defining a special set of input variables, the conformal safety set, which can identify patterns in the input space that satisfy the error coverage guarantee, i.e., that the probability of observing the wrong (possibly unsafe) label for points belonging to this set is bounded by a predefined <span>\\(\\varepsilon\\)</span> error level. We demonstrate the practical implications of this framework through an application in cybersecurity for identifying DNS tunneling attacks. Our work contributes to the development of probabilistically robust and reliable machine learning models.</p>","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"72 1","pages":""},"PeriodicalIF":4.3000,"publicationDate":"2024-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Conformal predictions for probabilistically robust scalable machine learning classification\",\"authors\":\"Alberto Carlevaro, Teodoro Alamo, Fabrizio Dabbene, Maurizio Mongelli\",\"doi\":\"10.1007/s10994-024-06571-6\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>Conformal predictions make it possible to define reliable and robust learning algorithms. But they are essentially a method for evaluating whether an algorithm is good enough to be used in practice. To define a reliable learning framework for classification from the very beginning of its design, the concept of scalable classifier was introduced to generalize the concept of classical classifier by linking it to statistical order theory and probabilistic learning theory. In this paper, we analyze the similarities between scalable classifiers and conformal predictions by introducing a new definition of a score function and defining a special set of input variables, the conformal safety set, which can identify patterns in the input space that satisfy the error coverage guarantee, i.e., that the probability of observing the wrong (possibly unsafe) label for points belonging to this set is bounded by a predefined <span>\\\\(\\\\varepsilon\\\\)</span> error level. We demonstrate the practical implications of this framework through an application in cybersecurity for identifying DNS tunneling attacks. Our work contributes to the development of probabilistically robust and reliable machine learning models.</p>\",\"PeriodicalId\":49900,\"journal\":{\"name\":\"Machine Learning\",\"volume\":\"72 1\",\"pages\":\"\"},\"PeriodicalIF\":4.3000,\"publicationDate\":\"2024-07-09\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Machine Learning\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1007/s10994-024-06571-6\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Machine Learning","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s10994-024-06571-6","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0

摘要

共形预测使我们有可能定义可靠、稳健的学习算法。但是,它们本质上是一种评估算法是否足以在实践中使用的方法。为了从设计之初就定义可靠的分类学习框架,我们引入了可扩展分类器的概念,通过将其与统计秩理论和概率学习理论联系起来,对经典分类器的概念进行了概括。在本文中,我们分析了可扩展分类器和保形预测之间的相似性,引入了得分函数的新定义,并定义了一个特殊的输入变量集--保形安全集,它可以识别输入空间中满足误差覆盖保证的模式,即观察到属于该集的点的错误(可能不安全)标签的概率被预定义的\(\varepsilon\)误差水平所限制。我们通过在网络安全中识别 DNS 隧道攻击的应用,展示了这一框架的实际意义。我们的工作有助于开发概率上稳健可靠的机器学习模型。
本文章由计算机程序翻译,如有差异,请以英文原文为准。

Conformal predictions for probabilistically robust scalable machine learning classification

Conformal predictions for probabilistically robust scalable machine learning classification

Conformal predictions make it possible to define reliable and robust learning algorithms. But they are essentially a method for evaluating whether an algorithm is good enough to be used in practice. To define a reliable learning framework for classification from the very beginning of its design, the concept of scalable classifier was introduced to generalize the concept of classical classifier by linking it to statistical order theory and probabilistic learning theory. In this paper, we analyze the similarities between scalable classifiers and conformal predictions by introducing a new definition of a score function and defining a special set of input variables, the conformal safety set, which can identify patterns in the input space that satisfy the error coverage guarantee, i.e., that the probability of observing the wrong (possibly unsafe) label for points belonging to this set is bounded by a predefined \(\varepsilon\) error level. We demonstrate the practical implications of this framework through an application in cybersecurity for identifying DNS tunneling attacks. Our work contributes to the development of probabilistically robust and reliable machine learning models.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Machine Learning
Machine Learning 工程技术-计算机:人工智能
CiteScore
11.00
自引率
2.70%
发文量
162
审稿时长
3 months
期刊介绍: Machine Learning serves as a global platform dedicated to computational approaches in learning. The journal reports substantial findings on diverse learning methods applied to various problems, offering support through empirical studies, theoretical analysis, or connections to psychological phenomena. It demonstrates the application of learning methods to solve significant problems and aims to enhance the conduct of machine learning research with a focus on verifiable and replicable evidence in published papers.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信