{"title":"可扩展的岭杠杆得分抽样Nyström方法","authors":"Farah Cherfaoui, H. Kadri, L. Ralaivola","doi":"10.1109/icassp43922.2022.9747039","DOIUrl":null,"url":null,"abstract":"The Nyström method, known as an efficient technique for approximating Gram matrices, builds upon a small subset of the data called landmarks, whose choice impacts the quality of the approximated Gram matrix. Various sampling methods have been proposed in the literature to choose such a subset, among which some based on ridge Leverage scores, which come with good theoretical and practical results. Nevertheless, direct computation of ridge leverage scores has an Θ(n3) computation cost if n is the number of data, which is prohibitive when n is large. To tackle this problem, we here propose a Θ(n) divide-and-conquer (DAC) method to approximate ridge leverage scores and we provide theoretical guarantees and empirical results regarding their ability to blend with the Nyström approximation strategy. Our experimental results show that the proposed approximate leverage score sampling scheme achieves a good trade-off between predictive performance and running time.","PeriodicalId":272439,"journal":{"name":"ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Scalable Ridge Leverage Score Sampling for the Nyström Method\",\"authors\":\"Farah Cherfaoui, H. Kadri, L. Ralaivola\",\"doi\":\"10.1109/icassp43922.2022.9747039\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The Nyström method, known as an efficient technique for approximating Gram matrices, builds upon a small subset of the data called landmarks, whose choice impacts the quality of the approximated Gram matrix. Various sampling methods have been proposed in the literature to choose such a subset, among which some based on ridge Leverage scores, which come with good theoretical and practical results. Nevertheless, direct computation of ridge leverage scores has an Θ(n3) computation cost if n is the number of data, which is prohibitive when n is large. To tackle this problem, we here propose a Θ(n) divide-and-conquer (DAC) method to approximate ridge leverage scores and we provide theoretical guarantees and empirical results regarding their ability to blend with the Nyström approximation strategy. Our experimental results show that the proposed approximate leverage score sampling scheme achieves a good trade-off between predictive performance and running time.\",\"PeriodicalId\":272439,\"journal\":{\"name\":\"ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)\",\"volume\":\"33 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-05-23\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/icassp43922.2022.9747039\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/icassp43922.2022.9747039","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Scalable Ridge Leverage Score Sampling for the Nyström Method
The Nyström method, known as an efficient technique for approximating Gram matrices, builds upon a small subset of the data called landmarks, whose choice impacts the quality of the approximated Gram matrix. Various sampling methods have been proposed in the literature to choose such a subset, among which some based on ridge Leverage scores, which come with good theoretical and practical results. Nevertheless, direct computation of ridge leverage scores has an Θ(n3) computation cost if n is the number of data, which is prohibitive when n is large. To tackle this problem, we here propose a Θ(n) divide-and-conquer (DAC) method to approximate ridge leverage scores and we provide theoretical guarantees and empirical results regarding their ability to blend with the Nyström approximation strategy. Our experimental results show that the proposed approximate leverage score sampling scheme achieves a good trade-off between predictive performance and running time.