Explicit solutions for the asymptotically-optimal bandwidth in cross-validation
Karim M Abadir, Michel Lubrano
Biometrika, published 2024-02-08
DOI: 10.1093/biomet/asae007 (https://doi.org/10.1093/biomet/asae007)
Citations: 0
Abstract
We show that least squares cross-validation methods share a common structure that has an explicit asymptotic solution when the chosen kernel is asymptotically separable in bandwidth and data. For density estimation with a multivariate Student t(ν) kernel, the cross-validation criterion becomes asymptotically equivalent to a polynomial of only three terms. Our bandwidth formulae are simple and noniterative, leading to very fast computations; their integrated squared error dominates that of traditional cross-validation implementations; they alleviate the notorious sample variability of cross-validation; and they overcome its breakdown in the case of repeated observations. We illustrate our method with univariate and bivariate applications of density estimation and nonparametric regression to a large dataset of Michigan State University academic wages and experience.
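For context, the traditional iterative approach that the abstract contrasts with its explicit formulae can be sketched as follows. This is a minimal illustration of standard univariate least squares cross-validation, not the authors' method: it assumes a Gaussian kernel (rather than the Student t(ν) kernel of the paper) and a plain grid search, and the function names `lscv_gaussian` and `lscv_bandwidth` are hypothetical.

```python
import math


def lscv_gaussian(h, x):
    """Least squares cross-validation score for a Gaussian-kernel density estimate.

    Assumption: univariate data and a Gaussian kernel, so the integral of the
    squared estimate has a closed form through the N(0, 2) density.
    """
    n = len(x)
    conv_sum = 0.0  # sum over all pairs (i, j) of the kernel convolved with itself
    kern_sum = 0.0  # sum over pairs with i != j of the kernel (leave-one-out part)
    for i in range(n):
        for j in range(n):
            u = (x[i] - x[j]) / h
            conv_sum += math.exp(-0.25 * u * u) / math.sqrt(4.0 * math.pi)
            if i != j:
                kern_sum += math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)
    # Integrated squared estimate minus twice the mean leave-one-out density.
    return conv_sum / (n * n * h) - 2.0 * kern_sum / (n * (n - 1) * h)


def lscv_bandwidth(x, grid):
    """Return the grid bandwidth with the smallest score, plus all scores."""
    scores = [lscv_gaussian(h, x) for h in grid]
    best = min(range(len(grid)), key=scores.__getitem__)
    return grid[best], scores
```

Each score costs a double loop over the sample, and the search repeats it for every candidate bandwidth, which is the kind of cost a noniterative formula avoids. One way to see the repeated-observations problem in this sketch: for `x = [0.0, 0.0]` the score is a negative constant divided by `h`, so it keeps decreasing as `h` shrinks and the search always picks the smallest bandwidth on the grid.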
About the journal:
Biometrika is primarily a journal of statistics in which emphasis is placed on papers containing original theoretical contributions of direct or potential value in applications. From time to time, papers in bordering fields are also published.