Jeremy Wang, Fernando Pardo-Manuel de Villena, Kyle J Moore, Wei Wang, Qi Zhang, Leonard McMillan
The 2010 ACM International Conference on Bioinformatics and Computational Biology (ACM-BCB 2010), Niagara Falls, New York, U.S.A., August 2-4, 2010, pp. 43-52. Published 2010-08-01. DOI: 10.1145/1854776.1854788 (https://doi.org/10.1145/1854776.1854788)
Genome-wide compatible SNP intervals and their properties.
Intraspecific genomes can be subdivided into blocks with limited diversity. Understanding the distribution and structure of these blocks will help to unravel many biological problems, including identifying genes associated with complex diseases, finding the ancestral origins of a given population, and localizing regions of historical recombination, gene conversion, and homoplasy. We present methods for partitioning a genome into blocks that show no apparent recombination, thus providing parsimonious sets of compatible genome intervals based on the four-gamete test. Our contribution is a thorough analysis of the problem of dividing a genome into compatible intervals: we characterize its computational complexity and provide an achievable lower bound on the minimal number of intervals required to cover an entire data set. In general, such minimal interval partitions are not unique; however, we identify properties that are common to every possible solution. We also define an interval set that achieves the interval lower bound yet maximizes interval overlap. We demonstrate algorithms for partitioning both haplotype data from inbred mice and outbred heterozygous genotype data, using extensions of the standard four-gamete test. These extensions allow our algorithms to be applied to a wide range of genomic data sets.
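To make the four-gamete test concrete, the following is a minimal sketch, not the authors' implementation. It assumes fully phased, biallelic haplotypes with no missing calls, so it does not cover the heterozygous-genotype extensions mentioned in the abstract. The function names `four_gamete_compatible` and `greedy_compatible_intervals` are hypothetical. Two SNPs are treated as compatible when at most three of the four possible allele pairs (gametes) appear across the samples, and a simple left-to-right greedy scan closes the current block as soon as a new SNP conflicts with any SNP already in it.

```python
def four_gamete_compatible(snp_a, snp_b):
    """Return True if two biallelic SNP columns show fewer than four distinct gametes.

    snp_a and snp_b hold one allele per sample, in the same sample order.
    """
    gametes = set(zip(snp_a, snp_b))
    return len(gametes) < 4


def greedy_compatible_intervals(haplotypes):
    """Partition SNP indices into contiguous, pairwise-compatible blocks.

    haplotypes: list of equal-length sequences (e.g. strings of '0'/'1'),
    one per sample. Returns a list of (start, end) index pairs, inclusive.
    Assumption: phased data without missing values.
    """
    if not haplotypes or len(haplotypes[0]) == 0:
        return []
    n_snps = len(haplotypes[0])
    # Transpose so that each entry is one SNP's alleles across all samples.
    columns = [tuple(h[j] for h in haplotypes) for j in range(n_snps)]

    intervals = []
    start = 0
    for j in range(1, n_snps):
        # Close the block if SNP j conflicts with any SNP already in it.
        conflict = any(
            not four_gamete_compatible(columns[i], columns[j])
            for i in range(start, j)
        )
        if conflict:
            intervals.append((start, j - 1))
            start = j
    intervals.append((start, n_snps - 1))
    return intervals


if __name__ == "__main__":
    samples = ["000", "010", "100", "111"]
    print(greedy_compatible_intervals(samples))
```

In the toy input, the first two SNPs exhibit all four gametes and so cannot share a block, while the second and third show only three. Because every sub-interval of a compatible interval is itself compatible, this kind of greedy scan yields a non-overlapping cover with the fewest blocks. It does not attempt the maximally overlapping interval sets the abstract describes.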