Optimal Regularization for a Data Source

Impact factor 2.5; Zone 1 (Mathematics); JCR Q2 (Computer Science, Theory & Methods)
Oscar Leong, Eliza O’Reilly, Yong Sheng Soh, Venkat Chandrasekaran
Foundations of Computational Mathematics. Published 2025-01-27. DOI: 10.1007/s10208-025-09693-y (https://doi.org/10.1007/s10208-025-09693-y)

Abstract

In optimization-based approaches to inverse problems and to statistical estimation, it is common to augment criteria that enforce data fidelity with a regularizer that promotes desired structural properties in the solution. The choice of a suitable regularizer is typically driven by a combination of prior domain information and computational considerations. Convex regularizers are attractive computationally but they are limited in the types of structure they can promote. On the other hand, nonconvex regularizers are more flexible in the forms of structure they can promote and they have showcased strong empirical performance in some applications, but they come with the computational challenge of solving the associated optimization problems. In this paper, we seek a systematic understanding of the power and the limitations of convex regularization by investigating the following questions: Given a distribution, what is the optimal regularizer for data drawn from the distribution? What properties of a data source govern whether the optimal regularizer is convex? We address these questions for the class of regularizers specified by functionals that are continuous, positively homogeneous, and positive away from the origin. We say that a regularizer is optimal for a data distribution if the Gibbs density with energy given by the regularizer maximizes the population likelihood (or equivalently, minimizes cross-entropy loss) over all regularizer-induced Gibbs densities. As the regularizers we consider are in one-to-one correspondence with star bodies, we leverage dual Brunn-Minkowski theory to show that a radial function derived from a data distribution is akin to a “computational sufficient statistic” as it is the key quantity for identifying optimal regularizers and for assessing the amenability of a data source to convex regularization. 
Using tools such as \(\Gamma\)-convergence from variational analysis, we show that our results are robust in the sense that the optimal regularizers for a sample drawn from a distribution converge to their population counterparts as the sample size grows large. Finally, we give generalization guarantees for various families of star bodies that recover previous results for polyhedral regularizers (i.e., dictionary learning) and lead to new ones for a variety of classes of star bodies.
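
To make the optimality criterion concrete, here is a minimal sketch in notation that is assumed here rather than taken from the abstract: \(K\) is a star body in \(\mathbb{R}^d\), \(\|x\|_K = \inf\{t > 0 : x \in tK\}\) is its gauge (playing the role of the regularizer), and \(P\) is the data distribution with \(\mathbb{E}_{x \sim P}\|x\|_K < \infty\). The paper's own formulation may be normalized differently. The regularizer-induced Gibbs density and its normalizing constant are

\[
p_K(x) = \frac{e^{-\|x\|_K}}{Z_K}, \qquad Z_K = \int_{\mathbb{R}^d} e^{-\|x\|_K}\,dx = \int_0^\infty e^{-t}\,t^d\,\mathrm{vol}(K)\,dt = d!\,\mathrm{vol}(K),
\]

where the middle step uses \(\mathrm{vol}(\{x : \|x\|_K \le t\}) = t^d\,\mathrm{vol}(K)\), which follows from positive homogeneity. The population log-likelihood is then

\[
\mathbb{E}_{x \sim P}\left[\log p_K(x)\right] = -\,\mathbb{E}_{x \sim P}\|x\|_K \;-\; \log \mathrm{vol}(K) \;-\; \log d!,
\]

so, under these assumptions, maximizing the population likelihood over star bodies amounts to minimizing \(\mathbb{E}_{x \sim P}\|x\|_K + \log \mathrm{vol}(K)\). The two terms pull in opposite directions: shrinking \(K\) lowers the volume term but inflates the gauge of every data point.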

Source journal
Foundations of Computational Mathematics (Mathematics; Computer Science, Theory & Methods)
CiteScore: 6.90
Self-citation rate: 3.30%
Articles published: 46
Review time: >12 weeks
About the journal: Foundations of Computational Mathematics (FoCM) will publish research and survey papers of the highest quality which further the understanding of the connections between mathematics and computation. The journal aims to promote the exploration of all fundamental issues underlying the creative tension among mathematics, computer science and application areas, unencumbered by any external criteria such as the pressure for applications. The journal will thus serve an increasingly important and applicable area of mathematics. The journal hopes to further the understanding of the deep relationships between mathematical theory (analysis, topology, geometry and algebra) and the computational processes as they are evolving in tandem with the modern computer. With its distinguished editorial board selecting papers of the highest quality and interest from the international community, FoCM hopes to influence both mathematics and computation. Relevance to applications will not constitute a requirement for the publication of articles. The journal does not accept code for review; however, authors who have code or data related to the submission should include a weblink to the repository where the data or code is stored.