Stefan Nikolić, Dmitry I Ignatov, Gennady V Khvorykh, Svetlana A Limborska, Andrey V Khrunin
{"title":"Genome-wide association studies of ischemic stroke based on interpretable machine learning.","authors":"Stefan Nikolić, Dmitry I Ignatov, Gennady V Khvorykh, Svetlana A Limborska, Andrey V Khrunin","doi":"10.7717/peerj-cs.2454","DOIUrl":null,"url":null,"abstract":"<p><p>Despite the identification of several dozen genetic loci associated with ischemic stroke (IS), the genetic bases of this disease remain largely unexplored. In this research we present the results of genome-wide association studies (GWAS) based on classical statistical testing and machine learning algorithms (logistic regression, gradient boosting on decision trees, and tabular deep learning model TabNet). To build a consensus on the results obtained by different techniques, the Pareto-Optimal solution was proposed and applied. These methods were applied to real genotypic data of sick and healthy individuals of European ancestry obtained from the Database of Genotypes and Phenotypes (5,581 individuals, 883,749 single nucleotide polymorphisms). Finally, 131 genes were identified as candidates for association with the onset of IS. <i>UBQLN1</i>, <i>TRPS1</i>, and <i>MUSK</i> were previously described as associated with the course of IS in model animals. <i>ACOT11</i> taking part in metabolism of fatty acids was shown for the first time to be associated with IS. The identified genes were compared with genes from the Illuminating Druggable Genome project. The product of <i>GPR26</i> representing the G-coupled protein receptor can be considered as a therapeutic target for stroke prevention. The approaches presented in this research can be used to reprocess GWAS datasets from other diseases.</p>","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":"10 ","pages":"e2454"},"PeriodicalIF":3.5000,"publicationDate":"2024-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11623107/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"PeerJ Computer Science","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.7717/peerj-cs.2454","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2024/1/1 0:00:00","PubModel":"eCollection","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
Abstract
Despite the identification of several dozen genetic loci associated with ischemic stroke (IS), the genetic bases of this disease remain largely unexplored. In this research we present the results of genome-wide association studies (GWAS) based on classical statistical testing and machine learning algorithms (logistic regression, gradient boosting on decision trees, and tabular deep learning model TabNet). To build a consensus on the results obtained by different techniques, the Pareto-Optimal solution was proposed and applied. These methods were applied to real genotypic data of sick and healthy individuals of European ancestry obtained from the Database of Genotypes and Phenotypes (5,581 individuals, 883,749 single nucleotide polymorphisms). Finally, 131 genes were identified as candidates for association with the onset of IS. UBQLN1, TRPS1, and MUSK were previously described as associated with the course of IS in model animals. ACOT11 taking part in metabolism of fatty acids was shown for the first time to be associated with IS. The identified genes were compared with genes from the Illuminating Druggable Genome project. The product of GPR26 representing the G-coupled protein receptor can be considered as a therapeutic target for stroke prevention. The approaches presented in this research can be used to reprocess GWAS datasets from other diseases.
期刊介绍:
PeerJ Computer Science is the new open access journal covering all subject areas in computer science, with the backing of a prestigious advisory board and more than 300 academic editors.