I. V. Bezdvornykh, N. A. Cherkasov, A. A. Kanapin, A. A. Samsonova
Biophysics, vol. 68, no. 5, pp. 755-759. Published 2024-03-07. DOI: 10.1134/S0006350923050056
https://link.springer.com/article/10.1134/S0006350923050056
Searching for Sequencing Signal Anomalies Associated with Genomic Structural Variations
Genomic structural variations (SVs) are among the main sources of genetic diversity. As mutagens, structural variants may significantly affect human health, causing hereditary diseases and cancers. Existing methods analyze high-throughput sequencing data to find structural variants. Despite substantial progress in their development, these methods still fail to detect structural variations with an accuracy sufficient for use in diagnosis. The sequencing coverage signal (i.e., the number of aligned sequencing reads at every position of a genome) can be treated as a time series, and its analysis holds new potential for designing approaches to structural variation detection. A method to detect repetitive patterns in the coverage signal was developed based on the time-series analysis algorithms KNN (k-nearest neighbors) and SAX (Symbolic Aggregate approXimation). Using a rich dataset encompassing the full genomes of 911 individuals with different ethnic backgrounds from the Human Genome Diversity Project, generalized patterns of the coverage signal were constructed for regions in the vicinity of breakpoints corresponding to various structural variant types. The patterns were used to develop a software package for fast detection of anomalies in the coverage signal.
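To make the SAX step concrete, the sketch below turns a short coverage window into a symbolic word and then looks up the closest labelled word by Hamming distance (a 1-nearest-neighbor match). It is only an illustration under stated assumptions, not the authors' software: the function names `sax_word` and `nearest_pattern`, the four-letter alphabet, the segment count, the toy coverage values, and the pattern labels are all hypothetical, and the abstract does not say how KNN and SAX are combined in the actual package.

```python
from statistics import NormalDist, mean, pstdev


def sax_word(signal, n_segments, alphabet="abcd"):
    """Convert a numeric series (e.g. per-base coverage) into a SAX word."""
    if not 0 < n_segments <= len(signal):
        raise ValueError("n_segments must be between 1 and len(signal)")

    # Step 1: z-normalise the window; a perfectly flat window maps to zeros.
    mu, sigma = mean(signal), pstdev(signal)
    z = [(x - mu) / sigma if sigma > 0 else 0.0 for x in signal]

    # Step 2: piecewise aggregate approximation (mean of each chunk).
    n = len(z)
    paa = [
        mean(z[i * n // n_segments:(i + 1) * n // n_segments])
        for i in range(n_segments)
    ]

    # Step 3: cut points that split the standard normal into equal-probability bins.
    a = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(k / a) for k in range(1, a)]

    # Step 4: each chunk mean becomes the letter of the bin it falls into.
    return "".join(alphabet[sum(v > b for b in breakpoints)] for v in paa)


def nearest_pattern(word, labelled_words):
    """Return the label of the reference word closest in Hamming distance.

    `labelled_words` is a hypothetical dict {sax_word: label}; all words are
    assumed to have the same length as `word`.
    """
    def hamming(u, v):
        return sum(c1 != c2 for c1, c2 in zip(u, v))

    best = min(labelled_words, key=lambda ref: hamming(word, ref))
    return labelled_words[best]


# Toy window: coverage halves in the middle, as a deletion-like dip might look.
coverage = [30] * 8 + [15] * 8 + [30] * 8
word = sax_word(coverage, n_segments=6)
print(word)  # ddaadd

# Hypothetical reference words; real patterns would come from annotated breakpoints.
references = {"ddaadd": "deletion-like", "aaddaa": "duplication-like"}
print(nearest_pattern(word, references))  # deletion-like
```

Working on short symbolic words rather than raw coverage values is what makes this kind of comparison cheap: once every window is reduced to a few letters, matching it against a catalogue of known patterns is a string comparison.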
Biophysics (Biochemistry, Genetics and Molecular Biology: Biophysics)
CiteScore
1.20
Self-citation rate
0.00%
Articles published
67
About the journal:
Biophysics is a multidisciplinary, international, peer-reviewed journal that covers a wide range of problems related to the main physical mechanisms of processes taking place at different levels of organization in biosystems. Its scope includes the structure and dynamics of macromolecules, cells, and tissues; the influence of the environment; energy transformation and transfer; thermodynamics; biological motility; population dynamics and cell differentiation modeling; biomechanics and tissue rheology; nonlinear phenomena; mathematical and cybernetic modeling of complex systems; and computational biology. The journal publishes short communications and review articles.