鲁棒机器学习的一般风险度量

IF 1.4 Q2 MATHEMATICS, APPLIED

Foundations of data science (Springfield, Mo.) Pub Date : 2019-04-26 DOI:10.3934/fods.2019011

É. Chouzenoux, Henri G'erard, J. Pesquet

{"title":"鲁棒机器学习的一般风险度量","authors":"É. Chouzenoux, Henri G'erard, J. Pesquet","doi":"10.3934/fods.2019011","DOIUrl":null,"url":null,"abstract":"A wide array of machine learning problems are formulated as the minimization of the expectation of a convex loss function on some parameter space. Since the probability distribution of the data of interest is usually unknown, it is is often estimated from training sets, which may lead to poor out-of-sample performance. In this work, we bring new insights in this problem by using the framework which has been developed in quantitative finance for risk measures. We show that the original min-max problem can be recast as a convex minimization problem under suitable assumptions. We discuss several important examples of robust formulations, in particular by defining ambiguity sets based on $\\varphi$-divergences and the Wasserstein metric.We also propose an efficient algorithm for solving the corresponding convex optimization problems involving complex convex constraints. Through simulation examples, we demonstrate that this algorithm scales well on real data sets.","PeriodicalId":73054,"journal":{"name":"Foundations of data science (Springfield, Mo.)","volume":" ","pages":""},"PeriodicalIF":1.4000,"publicationDate":"2019-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":"{\"title\":\"General risk measures for robust machine learning\",\"authors\":\"É. Chouzenoux, Henri G'erard, J. Pesquet\",\"doi\":\"10.3934/fods.2019011\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"A wide array of machine learning problems are formulated as the minimization of the expectation of a convex loss function on some parameter space. Since the probability distribution of the data of interest is usually unknown, it is is often estimated from training sets, which may lead to poor out-of-sample performance. In this work, we bring new insights in this problem by using the framework which has been developed in quantitative finance for risk measures. We show that the original min-max problem can be recast as a convex minimization problem under suitable assumptions. We discuss several important examples of robust formulations, in particular by defining ambiguity sets based on $\\\\varphi$-divergences and the Wasserstein metric.We also propose an efficient algorithm for solving the corresponding convex optimization problems involving complex convex constraints. Through simulation examples, we demonstrate that this algorithm scales well on real data sets.\",\"PeriodicalId\":73054,\"journal\":{\"name\":\"Foundations of data science (Springfield, Mo.)\",\"volume\":\" \",\"pages\":\"\"},\"PeriodicalIF\":1.4000,\"publicationDate\":\"2019-04-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"6\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Foundations of data science (Springfield, Mo.)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.3934/fods.2019011\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"MATHEMATICS, APPLIED\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Foundations of data science (Springfield, Mo.)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3934/fods.2019011","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"MATHEMATICS, APPLIED","Score":null,"Total":0}

引用次数: 6

摘要

一系列广泛的机器学习问题被公式化为在一些参数空间上凸损失函数的期望的最小化。由于感兴趣数据的概率分布通常是未知的，因此通常是根据训练集来估计的，这可能会导致样本外性能较差。在这项工作中，我们通过使用量化金融中开发的风险度量框架，为这个问题带来了新的见解。我们证明了在适当的假设下，原始的最小-最大问题可以被重新定义为凸最小化问题。我们讨论了鲁棒公式的几个重要例子，特别是通过定义基于$\varphi$-differences和Wasserstein度量的模糊集。我们还提出了一种有效的算法来解决涉及复杂凸约束的相应凸优化问题。通过仿真实例，我们证明了该算法在真实数据集上的良好扩展性。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

General risk measures for robust machine learning

A wide array of machine learning problems are formulated as the minimization of the expectation of a convex loss function on some parameter space. Since the probability distribution of the data of interest is usually unknown, it is is often estimated from training sets, which may lead to poor out-of-sample performance. In this work, we bring new insights in this problem by using the framework which has been developed in quantitative finance for risk measures. We show that the original min-max problem can be recast as a convex minimization problem under suitable assumptions. We discuss several important examples of robust formulations, in particular by defining ambiguity sets based on $\varphi$-divergences and the Wasserstein metric.We also propose an efficient algorithm for solving the corresponding convex optimization problems involving complex convex constraints. Through simulation examples, we demonstrate that this algorithm scales well on real data sets.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Foundations of data science (Springfield, Mo.)

CiteScore

3.30

自引率

0.00%

发文量