Reproducible Learning of Gaussian Graphical Models via Graphical Lasso Multiple Data Splitting
Authors: Kang Hu, Danning Li, Binghui Liu
Journal: Acta Mathematica Sinica-English Series, 41 (2), pp. 553-568
Publication date: 2025-02-15
Publication type: Journal Article
DOI: 10.1007/s10114-025-3324-1
URL: https://link.springer.com/article/10.1007/s10114-025-3324-1
Impact factor: 0.8; JCR: Q2 (Mathematics)
Citation count: 0
Abstract
Gaussian graphical models (GGMs) are widely used as intuitive and efficient tools for data analysis in several application domains. To address the reproducibility of structure learning for a GGM, it is essential to control the false discovery rate (FDR) of the estimated edge set of the graph. Hence, GGM estimation with FDR control has received increasing attention in recent years. In this paper, we propose a new GGM estimation method based on multiple data splitting. Instead of using node-by-node regressions to estimate each row of the precision matrix, we estimate the entire precision matrix directly with the graphical Lasso within the multiple data splitting procedure, which makes the computation p times faster than the previous approach. We show that the proposed method asymptotically controls the FDR and has significant advantages in computational efficiency. Finally, we demonstrate the usefulness of the proposed method through a real data analysis.
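To make the quantity being controlled concrete, the following minimal sketch computes the false discovery proportion (FDP) of an estimated edge set against a known true edge set; the FDR is the expectation of this proportion. It does not implement the paper's method. The function name, the representation of edges as node pairs, and the toy graphs are assumptions made only for this illustration.

```python
def false_discovery_proportion(estimated_edges, true_edges):
    """Return the FDP of an estimated undirected edge set.

    Illustrative helper (not from the paper): the FDR is the expected
    value of this proportion over repeated samples.
    """
    # Treat edges as unordered pairs so (i, j) and (j, i) are the same edge.
    est = {frozenset(e) for e in estimated_edges}
    true = {frozenset(e) for e in true_edges}
    # A false discovery is an estimated edge that is absent from the true graph.
    false_discoveries = len(est - true)
    # Convention: FDP is 0 when no edges are selected.
    return false_discoveries / max(len(est), 1)


# Toy example (hypothetical graphs): a chain 0-1-2-3 as the true graph.
true_edges = [(0, 1), (1, 2), (2, 3)]
estimated_edges = [(1, 0), (1, 2), (0, 3), (2, 3)]
fdp = false_discovery_proportion(estimated_edges, true_edges)
print(fdp)  # 0.25: one false edge (0, 3) among four selected edges
```

An FDR-controlling procedure at level q aims to keep the average of this proportion at or below q; in practice the true edge set is unknown, so the proportion can only be evaluated in simulations.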
About the journal:
Acta Mathematica Sinica, established by the Chinese Mathematical Society in 1936, is the first and the best mathematical journal in China. In 1985, Acta Mathematica Sinica was divided into an English Series and a Chinese Series. The English Series is a monthly journal that publishes significant research papers from all branches of pure and applied mathematics. It provides authoritative reviews of current developments in mathematical research. Contributions are invited from researchers all over the world.