Authors: G. Rambally, R. Rambally
Venue: IEEE SoutheastCon 2008
Published: 2008-04-03
DOI: 10.1109/SECON.2008.4494244 (https://doi.org/10.1109/SECON.2008.4494244)
A hybrid visualization Hidden Markov Model approach to identifying CG-islands in DNA sequences
CG-islands are runs of DNA where the CG dinucleotide has much higher-than-normal frequency, indicating the likely presence of important molecular genetic biomarkers. Given a DNA sequence, the CG-island location problem involves finding regions of the sequence with high frequencies of the CG dinucleotide, without any prior knowledge of what these regions look like. This paper proposes a hybrid visualization Hidden Markov Model (HMM) algorithm for finding CG-islands in DNA sequences. In the proposed method, each nucleotide base {A, T, C, G} in a DNA sequence is assigned a unique integer as a function of its immediate subsequent base, allowing the DNA sequence to be mapped to a corresponding numeric sequence. This numeric sequence is then plotted in 3-D space, from which approximate regions with high frequencies of the CG dinucleotide are identified. These regions are represented as Hidden Markov Models, from which we calculate the precise endpoints of the CG-islands. The major advantage of the proposed hybrid visualization HMM algorithm for locating CG-islands is its low computational complexity compared to other widely used algorithms.
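The abstract does not give the actual integer assignment or the plotting scheme, so the following is only a minimal sketch of the encoding idea under stated assumptions. The table `PAIR_CODE` (one integer per ordered pair of a base and its successor) and the helpers `encode` and `cg_window_counts` are hypothetical names introduced here for illustration. A sliding-window count stands in for the visual step of spotting approximate CG-rich regions; the HMM refinement of the island endpoints is not shown.

```python
from itertools import product

BASES = "ATCG"

# Assumed encoding (not taken from the paper): every ordered pair
# (base, next base) receives its own integer in the range 0..15.
PAIR_CODE = {a + b: i for i, (a, b) in enumerate(product(BASES, repeat=2))}


def encode(seq):
    """Map each base (except the last) to an integer that depends on
    the base itself and on its immediate successor."""
    seq = seq.upper()
    return [PAIR_CODE[seq[i:i + 2]] for i in range(len(seq) - 1)]


def cg_window_counts(seq, window):
    """Count CG dinucleotides in every sliding window of `window`
    consecutive encoded positions (a stand-in for visual inspection)."""
    cg = PAIR_CODE["CG"]
    hits = [1 if code == cg else 0 for code in encode(seq)]
    if window <= 0 or len(hits) < window:
        return []
    total = sum(hits[:window])
    counts = [total]
    # Slide the window one position at a time, updating the running total.
    for i in range(window, len(hits)):
        total += hits[i] - hits[i - window]
        counts.append(total)
    return counts
```

Windows whose counts stand well above the rest of the sequence would mark candidate regions to pass to an HMM for endpoint refinement; the window length and any threshold are free parameters in this sketch, not values reported by the authors.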