{"title":"再现核Hilbert空间中的投影发散性:渐近正态性、分块和切片估计以及计算效率","authors":"Yilin Zhang, Liping Zhu","doi":"10.1016/j.jmva.2023.105204","DOIUrl":null,"url":null,"abstract":"<div><p><span>We introduce projection divergence in the reproducing kernel Hilbert space to test for statistical independence and measure the degree of nonlinear dependence. We suggest a slicing procedure to estimate the kernel projection divergence, which divides a random sample of size </span><span><math><mi>n</mi></math></span> into <span><math><mi>H</mi></math></span> slices, each of size <span><math><mi>c</mi></math></span>. The entire procedure has the complexity of <span><math><mrow><mi>O</mi><mrow><mo>(</mo><msup><mrow><mi>n</mi></mrow><mrow><mn>2</mn></mrow></msup><mo>)</mo></mrow></mrow></math></span>, which is prohibitive if <span><math><mi>n</mi></math></span> is extremely large. To alleviate computational complexity, we implement this slicing procedure together with a block-wise estimation, which divides the whole sample into <span><math><mi>B</mi></math></span> blocks, each of size <span><math><mi>d</mi></math></span>. This block-wise and slicing estimation has the complexity of <span><math><mrow><mi>O</mi><mrow><mo>{</mo><mi>n</mi><mrow><mo>(</mo><mi>c</mi><mo>+</mo><mi>d</mi><mo>+</mo><mo>log</mo><mi>n</mi><mo>)</mo></mrow><mo>}</mo></mrow></mrow></math></span>, which reduces the computational complexity substantially if <span><math><mi>c</mi></math></span> and <span><math><mi>d</mi></math></span> are relatively small. The resultant estimation is asymptotically normal and has the convergence rate of <span><math><msup><mrow><mrow><mo>{</mo><mi>n</mi><mrow><mo>(</mo><mi>c</mi><mi>d</mi><mo>)</mo></mrow><mo>/</mo><mrow><mo>(</mo><mi>c</mi><mo>+</mo><mi>d</mi><mo>)</mo></mrow><mo>}</mo></mrow></mrow><mrow><mo>−</mo><mn>1</mn><mo>/</mo><mn>2</mn></mrow></msup></math></span><span>. More importantly, this block-wise implementation has the same asymptotic properties as the naive slicing estimation, if </span><span><math><mi>c</mi></math></span> is relatively small, indicating that the block-wise implementation does not result in power loss in independence tests. We demonstrate the computational efficiencies and theoretical properties of this block-wise and slicing estimation through simulations and an application to psychological datasets.</p></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":"197 ","pages":"Article 105204"},"PeriodicalIF":1.4000,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Projection divergence in the reproducing kernel Hilbert space: Asymptotic normality, block-wise and slicing estimation, and computational efficiency\",\"authors\":\"Yilin Zhang, Liping Zhu\",\"doi\":\"10.1016/j.jmva.2023.105204\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p><span>We introduce projection divergence in the reproducing kernel Hilbert space to test for statistical independence and measure the degree of nonlinear dependence. We suggest a slicing procedure to estimate the kernel projection divergence, which divides a random sample of size </span><span><math><mi>n</mi></math></span> into <span><math><mi>H</mi></math></span> slices, each of size <span><math><mi>c</mi></math></span>. 
The entire procedure has the complexity of <span><math><mrow><mi>O</mi><mrow><mo>(</mo><msup><mrow><mi>n</mi></mrow><mrow><mn>2</mn></mrow></msup><mo>)</mo></mrow></mrow></math></span>, which is prohibitive if <span><math><mi>n</mi></math></span> is extremely large. To alleviate computational complexity, we implement this slicing procedure together with a block-wise estimation, which divides the whole sample into <span><math><mi>B</mi></math></span> blocks, each of size <span><math><mi>d</mi></math></span>. This block-wise and slicing estimation has the complexity of <span><math><mrow><mi>O</mi><mrow><mo>{</mo><mi>n</mi><mrow><mo>(</mo><mi>c</mi><mo>+</mo><mi>d</mi><mo>+</mo><mo>log</mo><mi>n</mi><mo>)</mo></mrow><mo>}</mo></mrow></mrow></math></span>, which reduces the computational complexity substantially if <span><math><mi>c</mi></math></span> and <span><math><mi>d</mi></math></span> are relatively small. The resultant estimation is asymptotically normal and has the convergence rate of <span><math><msup><mrow><mrow><mo>{</mo><mi>n</mi><mrow><mo>(</mo><mi>c</mi><mi>d</mi><mo>)</mo></mrow><mo>/</mo><mrow><mo>(</mo><mi>c</mi><mo>+</mo><mi>d</mi><mo>)</mo></mrow><mo>}</mo></mrow></mrow><mrow><mo>−</mo><mn>1</mn><mo>/</mo><mn>2</mn></mrow></msup></math></span><span>. More importantly, this block-wise implementation has the same asymptotic properties as the naive slicing estimation, if </span><span><math><mi>c</mi></math></span> is relatively small, indicating that the block-wise implementation does not result in power loss in independence tests. We demonstrate the computational efficiencies and theoretical properties of this block-wise and slicing estimation through simulations and an application to psychological datasets.</p></div>\",\"PeriodicalId\":16431,\"journal\":{\"name\":\"Journal of Multivariate Analysis\",\"volume\":\"197 \",\"pages\":\"Article 105204\"},\"PeriodicalIF\":1.4000,\"publicationDate\":\"2023-09-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Multivariate Analysis\",\"FirstCategoryId\":\"100\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0047259X23000507\",\"RegionNum\":3,\"RegionCategory\":\"数学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"STATISTICS & PROBABILITY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Multivariate Analysis","FirstCategoryId":"100","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0047259X23000507","RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"STATISTICS & PROBABILITY","Score":null,"Total":0}
Projection divergence in the reproducing kernel Hilbert space: Asymptotic normality, block-wise and slicing estimation, and computational efficiency
We introduce projection divergence in the reproducing kernel Hilbert space to test for statistical independence and to measure the degree of nonlinear dependence. We suggest a slicing procedure to estimate the kernel projection divergence, which divides a random sample of size n into H slices, each of size c. The entire procedure has complexity O(n^2), which is prohibitive if n is extremely large. To alleviate the computational burden, we implement this slicing procedure together with a block-wise estimation, which divides the whole sample into B blocks, each of size d. This block-wise and slicing estimation has complexity O{n(c + d + log n)}, which reduces the computational cost substantially if c and d are relatively small. The resulting estimator is asymptotically normal and has the convergence rate {n(cd)/(c + d)}^{-1/2}. More importantly, if c is relatively small, the block-wise implementation has the same asymptotic properties as the naive slicing estimation, indicating that it incurs no power loss in independence tests. We demonstrate the computational efficiency and theoretical properties of this block-wise and slicing estimation through simulations and an application to psychological datasets.
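To make the complexity accounting concrete, below is a minimal sketch of a block-wise and slicing scheme in Python. The names block_slice_estimate and toy_block_stat are hypothetical, and the per-block statistic is a placeholder rather than the paper's kernel projection divergence estimator; the sketch only illustrates how sorting (O(n log n)), slicing (O(nc)), and within-block pairwise work (O(nd)) combine into the stated O{n(c + d + log n)} cost instead of O(n^2).

```python
# Hypothetical sketch of block-wise and slicing estimation; the per-block
# statistic below is a stand-in, NOT the paper's kernel projection divergence.
import numpy as np

def block_slice_estimate(x, y, c, d, block_stat):
    """Average a per-block dependence statistic over B = n // d blocks.

    x, y       : one-dimensional samples of equal size n
    c          : slice size, giving H = n // c slices of y
    d          : block size, giving B = n // d blocks
    block_stat : callable(x_block, slice_labels) -> float (assumed interface)
    """
    n = len(x)
    order = np.argsort(y)                  # sorting: O(n log n) overall
    slice_id = np.empty(n, dtype=int)
    slice_id[order] = np.arange(n) // c    # rank-based slices, each of size ~c
    # Pairwise work is confined to blocks of size d: B * O(d^2) = O(n d).
    stats = [block_stat(x[b:b + d], slice_id[b:b + d])
             for b in range(0, n - d + 1, d)]
    return float(np.mean(stats))

def toy_block_stat(x_block, s_block):
    """Placeholder with O(d^2) within-block cost: average |x_i - x_j|
    across slices minus within slices. Positive values suggest dependence."""
    m = len(x_block)
    diff = np.abs(x_block[:, None] - x_block[None, :])
    off_diag = ~np.eye(m, dtype=bool)
    same = (s_block[:, None] == s_block[None, :]) & off_diag
    cross = s_block[:, None] != s_block[None, :]
    if not same.any() or not cross.any():
        return 0.0
    return float(diff[cross].mean() - diff[same].mean())

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 10_000
    x = rng.normal(size=n)
    y = x ** 2 + 0.5 * rng.normal(size=n)  # nonlinear dependence
    print(block_slice_estimate(x, y, c=100, d=50, block_stat=toy_block_stat))
```

With c and d held fixed, the total cost of this scheme grows essentially linearly in n (up to the sorting term), which is the computational advantage the abstract claims over the O(n^2) full-sample procedure.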
About the journal:
Founded in 1971, the Journal of Multivariate Analysis (JMVA) is the central venue for the publication of new, relevant methodology and particularly innovative applications pertaining to the analysis and interpretation of multidimensional data.
The journal welcomes contributions to all aspects of multivariate data analysis and modeling, including cluster analysis, discriminant analysis, factor analysis, and multidimensional continuous or discrete distribution theory. Topics of current interest include, but are not limited to, inferential aspects of
Copula modeling
Functional data analysis
Graphical modeling
High-dimensional data analysis
Image analysis
Multivariate extreme-value theory
Sparse modeling
Spatial statistics.