Matrix completion and low-rank SVD via fast alternating least squares
Authors: T. Hastie, R. Mazumder, J. Lee, R. Zadeh
Journal: Journal of Machine Learning Research (JMLR)
Publication date: 2014-10-09
Publication type: Journal Article
DOI: 10.5555/2789272.2912106 (https://doi.org/10.5555/2789272.2912106)
Source: Semantic Scholar
Citations: 433
Abstract
The matrix-completion problem has attracted a lot of attention, largely as a result of the celebrated Netflix competition. Two popular approaches for solving the problem are nuclear-norm-regularized matrix approximation (Candès and Tao, 2009; Mazumder et al., 2010), and maximum-margin matrix factorization (Srebro et al., 2005). These two procedures are in some cases solving equivalent problems, but with quite different algorithms. In this article we bring the two approaches together, leading to an efficient algorithm for large matrix factorization and completion that outperforms both of these. We develop a software package softImpute in R for implementing our approaches, and a distributed version for very large matrices using the Spark cluster programming environment.
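As a rough illustration of the factorization side of the abstract, the sketch below fits a low-rank model to the observed entries of a matrix by alternating ridge regressions. It is a minimal sketch under stated assumptions, not the algorithm of the paper and not the softImpute package: it assumes NumPy, a squared-error loss with Frobenius penalties on both factors, and the names `objective`, `als_complete`, `rank`, `lam`, and `n_iters` are hypothetical choices made for this example.

```python
import numpy as np


def objective(X, mask, A, B, lam):
    """Penalized squared error on the observed entries only."""
    resid = (X - A @ B.T)[mask]
    return 0.5 * np.sum(resid ** 2) + 0.5 * lam * (np.sum(A ** 2) + np.sum(B ** 2))


def als_complete(X, mask, rank=2, lam=0.1, n_iters=50, seed=0):
    """Plain ridge-regularized alternating least squares (illustrative only).

    X    : (m, n) array; values where mask is False are ignored.
    mask : (m, n) boolean array, True where an entry is observed.
    Returns factors A (m, rank) and B (n, rank) so that A @ B.T estimates X.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    A = rng.standard_normal((m, rank))
    B = rng.standard_normal((n, rank))
    ridge = lam * np.eye(rank)
    for _ in range(n_iters):
        # With B fixed, each row of A solves a small ridge regression
        # that uses only the columns observed in that row.
        for i in range(m):
            obs = mask[i]
            Bo = B[obs]
            A[i] = np.linalg.solve(Bo.T @ Bo + ridge, Bo.T @ X[i, obs])
        # With A fixed, each row of B is updated symmetrically.
        for j in range(n):
            obs = mask[:, j]
            Ao = A[obs]
            B[j] = np.linalg.solve(Ao.T @ Ao + ridge, Ao.T @ X[obs, j])
    return A, B


# Synthetic usage example: a rank-2 matrix with about 60% of entries observed.
rng = np.random.default_rng(1)
truth = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 20))
mask = rng.random(truth.shape) < 0.6
X_obs = np.where(mask, truth, 0.0)

A, B = als_complete(X_obs, mask, rank=2, lam=1e-3)
estimate = A @ B.T  # filled-in matrix, including the unobserved entries
```

Each inner update minimizes the penalized objective exactly in one block of variables, so the objective never increases from one sweep to the next. The per-row solves are what make this naive version slow on large problems; the abstract's claim is that combining the two approaches yields something more efficient than either, which this sketch does not attempt to reproduce.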