{"title":"Benign Overfitting for $α$ Sub-exponential Input","authors":"Kota Okudo, Kei Kobayashi","doi":"arxiv-2409.00733","DOIUrl":null,"url":null,"abstract":"This paper investigates the phenomenon of benign overfitting in binary\nclassification problems with heavy-tailed input distributions. We extend the\nanalysis of maximum margin classifiers to $\\alpha$ sub-exponential\ndistributions, where $\\alpha \\in (0,2]$, generalizing previous work that\nfocused on sub-gaussian inputs. Our main result provides generalization error\nbounds for linear classifiers trained using gradient descent on unregularized\nlogistic loss in this heavy-tailed setting. We prove that under certain\nconditions on the dimensionality $p$ and feature vector magnitude $\\|\\mu\\|$,\nthe misclassification error of the maximum margin classifier asymptotically\napproaches the noise level. This work contributes to the understanding of\nbenign overfitting in more robust distribution settings and demonstrates that\nthe phenomenon persists even with heavier-tailed inputs than previously\nstudied.","PeriodicalId":501379,"journal":{"name":"arXiv - STAT - Statistics Theory","volume":"8 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - STAT - Statistics Theory","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.00733","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
This paper investigates the phenomenon of benign overfitting in binary
classification problems with heavy-tailed input distributions. We extend the
analysis of maximum margin classifiers to $\alpha$ sub-exponential
distributions, where $\alpha \in (0, 2]$, generalizing previous work that
focused on sub-Gaussian inputs. Our main result provides generalization error
bounds for linear classifiers trained by gradient descent on the unregularized
logistic loss in this heavy-tailed setting. We prove that, under certain
conditions on the dimensionality $p$ and the norm $\|\mu\|$ of the mean vector,
the misclassification error of the maximum margin classifier asymptotically
approaches the noise level. This work contributes to the understanding of
benign overfitting under weaker tail assumptions and demonstrates that the
phenomenon persists even with heavier-tailed inputs than previously studied.
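
For context, a standard way to formalize the $\alpha$ sub-exponential condition is via the Orlicz quasi-norm $\psi_\alpha$ (the paper's exact formulation may differ in constants): a random variable $X$ is $\alpha$ sub-exponential when
\[
\|X\|_{\psi_\alpha} = \inf\bigl\{ t > 0 : \mathbb{E}\,\exp\!\bigl(|X|^{\alpha}/t^{\alpha}\bigr) \le 2 \bigr\} < \infty,
\]
so that $\alpha = 2$ recovers the sub-Gaussian case, $\alpha = 1$ the classical sub-exponential case, and $\alpha < 1$ permits still heavier tails.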
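
The following is a minimal illustrative sketch of the setting, not the paper's experiments: all dimensions, step sizes, and the noise construction are assumptions chosen for illustration. It uses the common data model $x_i = y_i \mu + \xi_i$ with symmetrized Weibull($\alpha$) noise, whose tails decay like $\exp(-x^\alpha)$ and which is therefore a standard example of an $\alpha$ sub-exponential input; gradient descent on the unregularized logistic loss is known to converge in direction to the maximum margin classifier on linearly separable data.

```python
# Illustrative sketch (hypothetical parameters; not the authors' code).
import numpy as np
from scipy.special import expit  # numerically stable sigmoid

rng = np.random.default_rng(0)
n, p, alpha = 50, 2000, 1.0      # overparameterized regime p >> n; alpha in (0, 2]
mu = np.zeros(p)
mu[0] = 3.0                      # mean vector with ||mu|| = 3 (signal strength)

def sample(m):
    """Draw m labeled points from the model x = y * mu + noise, where the
    noise is symmetrized Weibull(alpha): alpha sub-exponential by construction."""
    y = rng.choice([-1.0, 1.0], size=m)
    noise = rng.choice([-1.0, 1.0], size=(m, p)) * rng.weibull(alpha, size=(m, p))
    return y[:, None] * mu[None, :] + noise, y

X, y = sample(n)

# Gradient descent on the unregularized logistic loss. On separable data the
# normalized iterate converges in direction to the maximum margin solution.
w, lr = np.zeros(p), 0.5
for _ in range(5000):
    coeff = y * expit(-y * (X @ w))              # -d(loss)/d(margin) per sample
    w += lr * (X * coeff[:, None]).mean(axis=0)  # descent step on the mean loss

w /= np.linalg.norm(w)
X_test, y_test = sample(5000)
print("test error:", np.mean(np.sign(X_test @ w) != y_test))
```

In this sketch the training data are interpolated (zero training error) because $p \gg n$, yet the test error of the learned direction can still be small; the paper's result gives conditions on $p$ and $\|\mu\|$ under which it approaches the noise level (zero here, since label noise is omitted).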