Learning Deep Kernels for Non-Parametric Independence Testing
Nathaniel Xu, Feng Liu, Danica J. Sutherland
arXiv - STAT - Machine Learning, 2024-09-10
DOI: arxiv-2409.06890
Abstract
The Hilbert-Schmidt Independence Criterion (HSIC) is a powerful tool for
nonparametric detection of dependence between random variables. It crucially
depends, however, on the selection of reasonable kernels; commonly used choices
like the Gaussian kernel, or the kernel that yields the distance covariance,
are sufficient only for amply sized samples from data distributions with
relatively simple forms of dependence. We propose a scheme for selecting the
kernels used in an HSIC-based independence test, based on maximizing an
estimate of the asymptotic test power. We prove that maximizing this estimate
indeed approximately maximizes the true power of the test, and demonstrate that
our learned kernels can identify forms of structured dependence between random
variables in various experiments.
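
For orientation, the following is a minimal sketch of the baseline that the abstract contrasts against: an HSIC independence test with fixed Gaussian kernels, calibrated by permutation. It is not the authors' method. The abstract does not give the form of the power estimate or the deep kernel architecture, so neither is attempted here. The function names (`gaussian_gram`, `hsic_biased`, `hsic_permutation_test`), the bandwidths, and the use of the biased (V-statistic) estimator are all assumptions made for illustration.

```python
import numpy as np


def gaussian_gram(x, bandwidth):
    """Gram matrix of the Gaussian kernel exp(-||a - b||^2 / (2 * bandwidth^2))."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)  # treat 1-D input as n x 1
    sq_norms = np.sum(x ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * x @ x.T
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth ** 2))


def hsic_biased(K, L):
    """Biased (V-statistic) HSIC estimate: trace(K H L H) / n^2."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return np.trace(K @ H @ L @ H) / n ** 2


def hsic_permutation_test(x, y, bw_x=1.0, bw_y=1.0, n_perms=200, seed=0):
    """Return (HSIC statistic, permutation p-value) for paired samples x, y.

    Bandwidths are fixed by hand here (an assumption); choosing them well is
    exactly the problem the paper addresses.
    """
    rng = np.random.default_rng(seed)
    K = gaussian_gram(x, bw_x)
    L = gaussian_gram(y, bw_y)
    observed = hsic_biased(K, L)

    null_stats = np.empty(n_perms)
    for i in range(n_perms):
        # Shuffling the y samples breaks any dependence on x while keeping
        # both marginals, which simulates the null hypothesis.
        perm = rng.permutation(len(L))
        null_stats[i] = hsic_biased(K, L[np.ix_(perm, perm)])

    # The +1 terms count the observed statistic among the permutations.
    p_value = (1 + np.sum(null_stats >= observed)) / (1 + n_perms)
    return observed, p_value


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.normal(size=200)
    y = x ** 2 + 0.1 * rng.normal(size=200)  # nonlinear dependence on x
    stat, p = hsic_permutation_test(x, y)
    print(f"HSIC = {stat:.4f}, p-value = {p:.4f}")
```

In this sketch the kernel is chosen before seeing the data. A kernel-learning scheme of the kind the abstract describes would instead select the kernel parameters by optimizing a power criterion, typically on data held out from the final test so that the permutation calibration stays valid.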