Determinant-based grouping of SNPs and its application for detecting disease-associated genomic loci.
Gennady Khvorykh, Andrey Khrunin
NAR Genomics and Bioinformatics, 7(1), lqaf024. Published 2025-03-18.
DOI: https://doi.org/10.1093/nargab/lqaf024
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11915498/pdf/
Groups of single nucleotide polymorphisms (SNPs) are more effective than individual SNPs in identifying genetic loci associated with diseases. However, an optimal method for grouping SNPs remains an open question. Here, we introduce a novel approach for SNP grouping, leveraging the determinant of linkage disequilibrium (LD) matrices as a comprehensive metric of multicollinearity. This method builds on the established use of determinants in regression analysis as an aggregate measure of variable interdependence. We propose that SNPs be grouped by evaluating the determinant of their LD matrices, and we validated the approach using both synthetic genotype-phenotype data and real-world data from genome-wide association studies (GWAS) of ischemic stroke. Application of this method identified two previously known and five novel candidate genes associated with the onset of disease. Additionally, we developed a straightforward procedure to estimate a critical parameter for the model: the minimal determinant value for an LD matrix to be considered singular. In summary, the determinant of the LD matrix serves as a robust integrative measure for assessing SNP group quality. This metric underpins a bioinformatics workflow capable of identifying genomic loci associated with disease onset, offering a valuable tool for advancing genetic association studies.
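To make the idea of the determinant as a multicollinearity measure concrete, the following is a minimal sketch and not the authors' workflow. It rests on several assumptions that the abstract does not state: the LD matrix is taken to be the Pearson correlation (r) matrix of genotype dosages, SNPs are scanned in their given order, a group is extended only while the determinant of its LD matrix stays above a cutoff, and every SNP column varies across individuals. The names ld_matrix, group_snps, and det_threshold, as well as the cutoff value 1e-3, are hypothetical.

```python
import numpy as np


def ld_matrix(genotypes):
    """Pearson correlation (r) between SNP columns.

    genotypes: array of shape (individuals, SNPs) with 0/1/2 dosages.
    Assumption: r is used as the LD measure; the paper may define LD differently.
    """
    return np.atleast_2d(np.corrcoef(genotypes, rowvar=False))


def group_snps(genotypes, det_threshold=1e-3):
    """Greedy grouping sketch: extend a group while det(LD) stays above the cutoff.

    det_threshold is a hypothetical stand-in for the paper's critical parameter.
    A determinant near 0 signals near-singular LD (strong multicollinearity).
    """
    ld = ld_matrix(genotypes)
    n_snps = ld.shape[0]
    groups, current = [], [0]
    for j in range(1, n_snps):
        candidate = current + [j]
        det = np.linalg.det(ld[np.ix_(candidate, candidate)])
        if det > det_threshold:
            current = candidate      # group is still well conditioned
        else:
            groups.append(current)   # close the group, start a new one at SNP j
            current = [j]
    groups.append(current)
    return groups


# Toy data: SNP 2 duplicates SNP 0, so adding it makes the LD matrix singular.
a = [0, 1, 2, 0, 1, 2, 0, 1]
b = [0, 0, 1, 1, 2, 2, 0, 1]
toy = np.array([a, b, a]).T
print(group_snps(toy))  # -> [[0, 1], [2]]
```

In the toy data the first two SNPs are only moderately correlated, so their 2x2 LD matrix has a determinant well above the cutoff, whereas adding the duplicated SNP drives the determinant to essentially zero and a new group is started.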