Large-Scale Precision Matrix Estimation With SQUIC
Aryan Eftekhari, Lisa Gaedke-Merzhaeuser, D. Pasadakis, M. Bollhoefer, S. Scheidegger, O. Schenk
PSN: Econometrics, published 2021-08-12. DOI: 10.2139/ssrn.3904001 (https://doi.org/10.2139/ssrn.3904001)
Abstract
High-dimensional sparse precision matrix estimation is a ubiquitous task in multivariate analysis, with applications across many disciplines. In this paper, we introduce the SQUIC package, which offers runtime performance and scalability significantly exceeding those of the available state-of-the-art packages. The package implements a second-order method that solves the L1-regularized maximum likelihood problem using highly optimized linear algebra subroutines, which exploit the underlying sparsity and the intrinsic parallelism of the computation. We provide two sets of numerical tests: the first consists of didactic examples on synthetic datasets that highlight the performance and accuracy of the package, and the second is a real-world classification problem on high-dimensional medical datasets. The base algorithm is implemented in C++, with interfaces for R and Python.
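For readers unfamiliar with the L1-regularized maximum likelihood problem named in the abstract, the following is a minimal NumPy sketch of the objective in its commonly used form, -log det(Theta) + tr(S Theta) + lambda * ||Theta||_1, where S is the sample covariance matrix and Theta is the candidate precision matrix. It is not SQUIC's implementation or API: the function name `l1_penalized_neg_log_likelihood`, the synthetic data, and the choice to penalize all entries of Theta (rather than only the off-diagonal ones) are assumptions made for illustration.

```python
import numpy as np


def l1_penalized_neg_log_likelihood(theta, S, lam):
    """Evaluate -log det(theta) + tr(S @ theta) + lam * sum(|theta_ij|).

    Hypothetical helper for illustration. `theta` is assumed symmetric;
    if it is not positive definite the objective is taken to be +inf.
    """
    try:
        chol = np.linalg.cholesky(theta)  # fails if theta is not positive definite
    except np.linalg.LinAlgError:
        return np.inf
    logdet = 2.0 * np.log(np.diag(chol)).sum()
    return -logdet + np.trace(S @ theta) + lam * np.abs(theta).sum()


# Synthetic data (assumption): 200 samples of 4 variables.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 4))
S = np.cov(X, rowvar=False)  # sample covariance matrix

theta_mle = np.linalg.inv(S)             # minimizer of the unpenalized problem (lam = 0)
theta_diag = np.diag(1.0 / np.diag(S))   # a sparse (diagonal) candidate

# Without a penalty, the dense inverse covariance attains the lower objective.
f_mle_0 = l1_penalized_neg_log_likelihood(theta_mle, S, 0.0)
f_diag_0 = l1_penalized_neg_log_likelihood(theta_diag, S, 0.0)
print(f_mle_0 <= f_diag_0)
```

The L1 term adds a cost proportional to the absolute size of every entry, so increasing `lam` favors estimates with more zero entries, which is what makes the resulting precision matrix sparse. Solving the minimization itself is the job of a dedicated solver such as the second-order method the paper describes.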