Modification of the Method for Calculating Polygenic Risks With Variation Graph
O. Kondrateva, E. Karpulevich
Trudy Instituta sistemnogo programmirovaniia RAN
Publication date: 2022-01-01
DOI: 10.15514/ispras-2022-34(2)-15
https://doi.org/10.15514/ispras-2022-34(2)-15
Abstract
A DNA sequence can be represented in several ways. The variation graph is among the most accurate representations: it makes it possible to work with atypical regions and to account for their full diversity. Using this data structure together with the polygenic risk scoring method, we built a DNA interpretation system. The system yields a correlation coefficient between the trait and the path in the graph that corresponds to a specific DNA sequence. We then compared this coefficient with one obtained by a similar method that represents sequences against a reference genome; the comparison allowed us to evaluate the effectiveness of the graph representation. Finally, we built a modified method that calculates the polygenic score from the alignment data of the vg tool and compared it with existing methods. The modified method improved prediction of the trait.
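
As a rough illustration of the evaluation described above, the sketch below computes a conventional additive polygenic score (a weighted sum of allele dosages) and its Pearson correlation with a trait. It is not the authors' modified method and does not read vg alignment data; the function names, effect weights, genotypes, and trait values are all hypothetical toy values chosen only to show the arithmetic.

```python
from math import sqrt


def polygenic_score(dosages, weights):
    """Additive score: sum over variants of effect weight * allele dosage."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(w * d for w, d in zip(weights, dosages))


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data: three variants, four individuals.
weights = [0.5, -0.25, 1.0]        # assumed per-variant effect sizes
genotypes = [                      # allele dosages (0, 1 or 2) per individual
    [0, 1, 2],
    [1, 0, 0],
    [2, 2, 1],
    [0, 0, 0],
]
traits = [2.0, 0.4, 1.6, 0.1]      # assumed measured trait values

scores = [polygenic_score(g, weights) for g in genotypes]
r = pearson(scores, traits)
print(scores, r)
```

In a graph-based setting, the dosages would presumably be derived from the paths an individual's reads align to rather than from calls against a linear reference; how that mapping is done is specific to the paper and is not reproduced here.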