Weibin Mo, Weijing Tang, Songkai Xue, Yufeng Liu, Ji Zhu
arXiv - MATH - Statistics Theory, 2024-05-02. DOI: arxiv-2405.01709 (https://doi.org/arxiv-2405.01709)
Minimax Regret Learning for Data with Heterogeneous Subgroups
Modern complex datasets often consist of various sub-populations. To develop
robust and generalizable methods in the presence of sub-population
heterogeneity, it is important to guarantee a uniform learning performance
rather than an average one. In many applications, prior information is
available on which sub-population or group the data points belong to. Given the
observed groups of data, we develop a min-max-regret (MMR) learning framework
for general supervised learning, which aims to minimize the worst-group
regret. Motivated by the regret-based decision-theoretic framework, the
proposed MMR is distinguished from the value-based or risk-based robust
learning methods in the existing literature. The regret criterion features
several robustness and invariance properties simultaneously. In terms of
generalizability, we establish a theoretical guarantee for the worst-case
regret over a super-population of the meta data, which incorporates the
observed sub-populations, their mixtures, as well as other unseen
sub-populations that could be approximated by the observed ones. We demonstrate
the effectiveness of our method through extensive simulation studies and an
application to kidney transplantation data from hundreds of transplant centers.
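
To make the worst-group regret criterion concrete, the following is a minimal sketch, not the authors' estimator or algorithm, neither of which is specified in the abstract. It assumes a linear model with squared loss, defines the regret of a parameter theta in group g as R_g(theta) - min over theta' of R_g(theta') (the excess risk over that group's own best linear fit), and minimizes the maximum regret across groups by plain subgradient descent. The function names, the synthetic data, the step size, and the iteration count are all hypothetical choices made for illustration.

```python
import numpy as np


def group_risk(theta, X, y):
    """Mean squared error of the linear predictor X @ theta on one group."""
    resid = X @ theta - y
    return float(np.mean(resid ** 2))


def fit_ols(X, y):
    """Least-squares fit, used here as the group-specific best linear predictor."""
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return theta


def group_regrets(theta, groups, best_risks):
    """Regret per group: risk of theta minus that group's best achievable risk."""
    return np.array(
        [group_risk(theta, X, y) - r for (X, y), r in zip(groups, best_risks)]
    )


def fit_minimax_regret(groups, n_iter=2000, step=0.05):
    """Subgradient descent on max_g regret_g(theta) (illustrative, assumed solver)."""
    best_risks = [group_risk(fit_ols(X, y), X, y) for X, y in groups]
    X_all = np.vstack([X for X, _ in groups])
    y_all = np.concatenate([y for _, y in groups])
    theta = fit_ols(X_all, y_all)  # start from the pooled (average-risk) fit
    best_theta = theta.copy()
    best_val = group_regrets(theta, groups, best_risks).max()
    for t in range(1, n_iter + 1):
        regrets = group_regrets(theta, groups, best_risks)
        g = int(np.argmax(regrets))  # currently worst-off group
        X, y = groups[g]
        grad = 2.0 * X.T @ (X @ theta - y) / len(y)  # gradient of that group's risk
        theta = theta - step / np.sqrt(t) * grad
        val = group_regrets(theta, groups, best_risks).max()
        if val < best_val:  # keep the best iterate seen so far
            best_theta, best_val = theta.copy(), val
    return best_theta, best_val, best_risks


# Synthetic heterogeneous groups (hypothetical data): different coefficients
# and unequal sample sizes, so the pooled fit favors the largest group.
rng = np.random.default_rng(0)
true_thetas = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])]
sizes = [500, 50, 50]
groups = []
for theta_g, n in zip(true_thetas, sizes):
    X = rng.normal(size=(n, 2))
    y = X @ theta_g + 0.1 * rng.normal(size=n)
    groups.append((X, y))

theta_mmr, mmr_worst, best_risks = fit_minimax_regret(groups)

X_all = np.vstack([X for X, _ in groups])
y_all = np.concatenate([y for _, y in groups])
theta_pooled = fit_ols(X_all, y_all)
pooled_regrets = group_regrets(theta_pooled, groups, best_risks)
mmr_regrets = group_regrets(theta_mmr, groups, best_risks)
pooled_worst = float(pooled_regrets.max())

print("pooled fit, regret per group:", pooled_regrets)
print("min-max-regret fit, regret per group:", mmr_regrets)
```

Because the search starts from the pooled fit and keeps the best iterate, the returned worst-group regret can never exceed that of the pooled fit in this sketch. Subtracting each group's own best risk is what separates a regret criterion from a worst-group risk criterion: a group that is intrinsically noisy does not dominate the maximum merely because its achievable risk is high.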