{"title":"The importance of convexity in learning with squared loss","authors":"Wee Sun Lee, P. Bartlett, R. C. Williamson","doi":"10.1145/238061.238082","DOIUrl":null,"url":null,"abstract":"We show that if the closure of a function class under the metric induced by some probability distribution is not convex, then the sample complexity for agnostically learning with squared loss (using only hypotheses in )i s where is the probability of success and is the required accuracy. In comparison, if the class is convex and has finite pseudodimension, then the sample complexity is . If a nonconvex class has finite pseudodimension, then the sample complexity for agnostically learning the closure of the convex hull of ,i s . Hence, for agnostic learning, learning the convex hull provides better approximation capabilities with little sample complexity penalty.","PeriodicalId":13250,"journal":{"name":"IEEE Trans. Inf. Theory","volume":"53 1","pages":"1974-1980"},"PeriodicalIF":0.0000,"publicationDate":"1998-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"116","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Trans. Inf. Theory","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/238061.238082","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 116
Abstract
We show that if the closure of a function class $F$ under the metric induced by some probability distribution is not convex, then the sample complexity for agnostically learning $F$ with squared loss (using only hypotheses in $F$) is $\Omega(\ln(1/\delta)/\epsilon^2)$, where $1-\delta$ is the probability of success and $\epsilon$ is the required accuracy. In comparison, if the class $F$ is convex and has finite pseudodimension, then the sample complexity is $O\left(\frac{1}{\epsilon}\left(\ln\frac{1}{\epsilon} + \ln\frac{1}{\delta}\right)\right)$. If a nonconvex class $F$ has finite pseudodimension, then the sample complexity for agnostically learning the closure of the convex hull of $F$ is $O\left(\frac{1}{\epsilon}\left(\frac{1}{\epsilon}\ln\frac{1}{\epsilon} + \ln\frac{1}{\delta}\right)\right)$. Hence, for agnostic learning, learning the convex hull provides better approximation capabilities with little sample complexity penalty.
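
To get a feel for how the three rates compare, the sketch below evaluates them numerically. It is an illustration only and rests on assumptions: the rates are taken to be ln(1/δ)/ε² for the nonconvex lower bound, (1/ε)(ln(1/ε) + ln(1/δ)) for the convex upper bound, and (1/ε)((1/ε)ln(1/ε) + ln(1/δ)) for the convex-hull upper bound, with all constants and pseudodimension factors dropped. The function names are hypothetical and do not come from the paper.

```python
import math


def nonconvex_lower_rate(eps: float, delta: float) -> float:
    """Assumed lower-bound rate ln(1/delta) / eps^2 (constants dropped)."""
    return math.log(1.0 / delta) / eps**2


def convex_upper_rate(eps: float, delta: float) -> float:
    """Assumed upper-bound rate (1/eps) * (ln(1/eps) + ln(1/delta))."""
    return (1.0 / eps) * (math.log(1.0 / eps) + math.log(1.0 / delta))


def convex_hull_upper_rate(eps: float, delta: float) -> float:
    """Assumed upper-bound rate (1/eps) * ((1/eps) * ln(1/eps) + ln(1/delta))."""
    return (1.0 / eps) * ((1.0 / eps) * math.log(1.0 / eps) + math.log(1.0 / delta))


if __name__ == "__main__":
    delta = 0.05
    for eps in (0.1, 0.01, 0.001):
        print(
            f"eps={eps:<6} "
            f"nonconvex>={nonconvex_lower_rate(eps, delta):.3g}  "
            f"convex<={convex_upper_rate(eps, delta):.3g}  "
            f"hull<={convex_hull_upper_rate(eps, delta):.3g}"
        )
```

Under these assumptions, the convex rate grows roughly like 1/ε while the other two grow roughly like 1/ε², so the convex-hull rate exceeds the nonconvex lower-bound rate only by a logarithmic factor in 1/ε. The numbers are not usable sample sizes, since the hidden constants are ignored.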