Haplotype parsing: methods for extracting information from human genetic variations.

Russell Schwartz
Applied bioinformatics, volume 3 (2-3), pages 181-91. Journal Article. Published 2004-01-01.
DOI: 10.2165/00822942-200403020-00012 (https://doi.org/10.2165/00822942-200403020-00012)
Citations: 2

Abstract

While the shared consensus genetic sequence of our species contains a great deal of information about our common biology, there is also much to be learned from the subtle genetic variations across our species. These variations are believed to be generally of little or no direct functional significance and predominantly reflect the chance accumulation of small genetic changes since our emergence as a species. Therefore, they carry little useful information when observed in a single individual. When tallied across a whole population though, these chance mutations can teach us a great deal about our evolutionary history and the patterns of inheritance in particular individuals. In particular, frequently observed patterns of single nucleotide polymorphisms (SNPs) in a population can identify segments of chromosome that have been passed down largely intact through long stretches of our evolution. Finding these frequently conserved chromosomal segments, or haplotypes, and developing methods to identify haplotype patterns in particular individuals, will in turn help us to identify those particular segments that carry genetic factors influencing risk for many common human diseases. To make the best use of this data, we will need to develop new models for the encoding of information in genome variations--the "language of genetic variation"--and new algorithms for fitting datasets to those models. This article surveys past work by the author and colleagues on this problem, utilising computational methods for locating frequent patterns in haploid sequence data, and "parsing" sequences so as to optimally explain them given the knowledge of the general population structure. The author's recent work in this area has been compiled into a set of computational tools available at http://www-2.cs.cmu.edu/~russells/software/hapmotif.html.
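
The idea of "parsing" a sequence against known population structure can be illustrated with a small sketch. The code below is not taken from the surveyed work or the hapmotif tools. It assumes a hypothetical representation in which each known haplotype motif is a pair (start SNP index, allele string) and each SNP is coded as "0" or "1". Under that assumption, it uses dynamic programming to split a target haploid sequence into the fewest motif segments that reproduce it exactly. Minimising the segment count is only one possible objective, chosen here for simplicity.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical motif representation: (start SNP index, allele string).
Motif = Tuple[int, str]


def parse_haplotype(target: str, motifs: List[Motif]) -> Optional[List[Motif]]:
    """Split `target` into the fewest motifs that tile it end to end.

    Returns the motifs in left-to-right order, or None when no exact
    tiling exists. This is an illustrative objective, not the published one.
    """
    n = len(target)
    inf = float("inf")
    best = [inf] * (n + 1)          # best[j]: fewest segments covering target[:j]
    back: List[Optional[Motif]] = [None] * (n + 1)
    best[0] = 0

    # Index motifs by the position where they begin.
    by_start: Dict[int, List[str]] = {}
    for start, alleles in motifs:
        by_start.setdefault(start, []).append(alleles)

    for i in range(n):
        if best[i] == inf:
            continue                 # position i is not reachable by any tiling
        for alleles in by_start.get(i, []):
            j = i + len(alleles)
            if j <= n and target[i:j] == alleles and best[i] + 1 < best[j]:
                best[j] = best[i] + 1
                back[j] = (i, alleles)

    if best[n] == inf:
        return None

    # Walk the back-pointers from the end to recover the segments.
    parse: List[Motif] = []
    j = n
    while j > 0:
        motif = back[j]
        assert motif is not None
        parse.append(motif)
        j = motif[0]
    parse.reverse()
    return parse


if __name__ == "__main__":
    library = [(0, "010"), (0, "01"), (2, "011"), (3, "11"), (0, "110")]
    print(parse_haplotype("01011", library))
```

A probabilistic variant would weight each motif by its population frequency and penalise transitions between motifs, rather than counting segments. That choice is left open here because the abstract does not specify the scoring model.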
