Minimax Regret Learning for Data with Heterogeneous Subgroups

Weibin Mo, Weijing Tang, Songkai Xue, Yufeng Liu, Ji Zhu
arXiv - MATH - Statistics Theory. Published 2024-05-02. DOI: arxiv-2405.01709 (https://doi.org/arxiv-2405.01709)

Abstract

Modern complex datasets often consist of various sub-populations. To develop robust and generalizable methods in the presence of sub-population heterogeneity, it is important to guarantee uniform learning performance rather than average performance. In many applications, prior information is available on which sub-population or group each data point belongs to. Given the observed groups of data, we develop a min-max-regret (MMR) learning framework for general supervised learning, which aims to minimize the worst-group regret. Motivated by the regret-based decision-theoretic framework, the proposed MMR is distinguished from the value-based or risk-based robust learning methods in the existing literature. The regret criterion features several robustness and invariance properties simultaneously. In terms of generalizability, we develop a theoretical guarantee for the worst-case regret over a super-population of the meta-data, which incorporates the observed sub-populations, their mixtures, and other unseen sub-populations that can be approximated by the observed ones. We demonstrate the effectiveness of our method through extensive simulation studies and an application to kidney transplantation data from hundreds of transplant centers.
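The contrast between a risk-based and a regret-based worst-group criterion can be illustrated with a small sketch. This is not the authors' implementation. It assumes squared-error loss, a constant predictor `c`, two simulated groups with different noise levels, and a plain grid search. It also assumes that a group's regret is its risk minus the best risk attainable within that group, which is the usual decision-theoretic definition but may differ in detail from the paper's formulation. All function names are hypothetical.

```python
import numpy as np


def group_risks(c, groups):
    """Empirical squared-error risk of the constant prediction c in each group."""
    return np.array([np.mean((y - c) ** 2) for y in groups])


def group_regrets(c, groups):
    """Risk of c minus the best risk a constant can achieve within each group."""
    # For squared error, the best constant in a group is that group's mean.
    best = np.array([np.mean((y - y.mean()) ** 2) for y in groups])
    return group_risks(c, groups) - best


def argmin_worst(criterion, groups, grid):
    """Grid point minimizing the worst-group value of the given criterion."""
    worst = [criterion(c, groups).max() for c in grid]
    return grid[int(np.argmin(worst))]


# Simulated data (an assumption for illustration): a low-noise group centered
# near 0 and a high-noise group centered near 1.
rng = np.random.default_rng(0)
groups = [rng.normal(0.0, 0.1, 500), rng.normal(1.0, 3.0, 500)]
grid = np.linspace(-1, 2, 301)

c_risk = argmin_worst(group_risks, groups, grid)      # worst-group risk
c_regret = argmin_worst(group_regrets, groups, grid)  # worst-group regret

print("worst-group-risk solution:  ", c_risk)
print("worst-group-regret solution:", c_regret)
```

In this toy setting the worst-group risk is dominated by the noisy group's irreducible variance, so its minimizer sits at that group's mean. The regret criterion subtracts the irreducible part, so its minimizer balances the two group means.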