Determinant-based grouping of SNPs and its application for detecting disease-associated genomic loci
Gennady Khvorykh, Andrey Khrunin
NAR Genomics and Bioinformatics, 7(1), lqaf024. Published 2025-03-18.
DOI: https://doi.org/10.1093/nargab/lqaf024
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11915498/pdf/
Citation count: 0
Abstract
Groups of single nucleotide polymorphisms (SNPs) are more effective than individual SNPs in identifying genetic loci associated with diseases. However, an optimal method for grouping SNPs remains an open question. Here, we introduce a novel approach for SNP grouping that uses the determinant of linkage disequilibrium (LD) matrices as a comprehensive metric of multicollinearity. This method builds on the established use of determinants in regression analysis as an aggregate measure of variable interdependence. We propose grouping SNPs by evaluating the determinant of their LD matrices, and we validated the approach using both synthetic genotype-phenotype data and real-world data from genome-wide association studies (GWAS) of ischemic stroke. Application of this method identified two previously known and five novel candidate genes associated with the onset of the disease. Additionally, we developed a straightforward procedure to estimate a critical parameter of the model: the minimal determinant value below which an LD matrix is considered singular. In summary, the determinant of the LD matrix serves as a robust integrative measure for assessing SNP group quality. This metric underpins a bioinformatics workflow capable of identifying genomic loci associated with disease onset, offering a valuable tool for advancing genetic association studies.
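The idea of using the determinant of an LD matrix as a grouping criterion can be illustrated with a minimal Python sketch. This is not the authors' implementation, which the abstract does not describe. It assumes that LD is estimated as the Pearson correlation between genotype columns coded 0/1/2, that monomorphic SNPs have already been removed, and that a group of consecutive SNPs is extended greedily while the determinant of its LD matrix stays at or above a threshold. The names `ld_matrix`, `group_snps`, and `min_det`, and the default threshold value, are hypothetical.

```python
import numpy as np


def ld_matrix(genotypes):
    """Pearson correlation between SNP columns (assumed LD estimate)."""
    return np.corrcoef(genotypes, rowvar=False)


def group_snps(genotypes, min_det=1e-3):
    """Greedily group consecutive SNPs (hypothetical procedure).

    genotypes: array of shape (n_samples, n_snps), coded 0/1/2.
    min_det:   assumed threshold below which an LD matrix is
               treated as singular.
    Returns a list of groups, each a list of SNP column indices.
    """
    n_snps = genotypes.shape[1]
    groups, current = [], [0]
    for j in range(1, n_snps):
        candidate = current + [j]
        det = np.linalg.det(ld_matrix(genotypes[:, candidate]))
        if det >= min_det:
            # The enlarged group is still far enough from singular.
            current = candidate
        else:
            # Adding SNP j makes the LD matrix near-singular:
            # close the current group and start a new one at j.
            groups.append(current)
            current = [j]
    groups.append(current)
    return groups


if __name__ == "__main__":
    # Toy data: SNP 2 duplicates SNP 0, so any group holding both is singular.
    snp0 = [0, 1, 2, 0, 1, 2, 0, 1]
    snp1 = [0, 0, 1, 1, 2, 2, 0, 1]
    snp3 = [2, 0, 1, 1, 0, 2, 1, 0]
    demo = np.array([snp0, snp1, snp0, snp3]).T
    print(group_snps(demo))
```

The determinant of a correlation matrix equals 1 when the variables are uncorrelated and approaches 0 as they become collinear, which is why a single threshold can act as a stopping rule in this sketch. How the threshold is actually estimated is described in the paper, not here.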