David S H Chew, Kwok Pui Choi, Hans Heidner, Ming-Ying Leung
{"title":"SARS 和其他冠状病毒中的倒位基因","authors":"David S H Chew, Kwok Pui Choi, Hans Heidner, Ming-Ying Leung","doi":"10.1287/ijoc.1040.0087","DOIUrl":null,"url":null,"abstract":"<p><p>With the identification of a novel coronavirus associated with the <i>severe acute respiratory syndrome</i> (SARS), computational analysis of its RNA genome sequence is expected to give useful clues to help elucidate the origin, evolution, and pathogenicity of the virus. In this paper, we study the collective counts of palindromes in the SARS genome along with all the completely sequenced coronaviruses. Based on a Markov-chain model for the genome sequence, the mean and standard deviation for the number of palindromes at or above a given length are derived. These theoretical results are complemented by extensive simulations to provide empirical estimates. Using a <i>z</i> score obtained from these mathematical and empirical means and standard deviations, we have observed that palindromes of length four are significantly underrepresented in all the coronaviruses in our data set. In contrast, length-six palindromes are significantly underrepresented only in the SARS coronavirus. Two other features are unique to the SARS sequence. First, there is a length-22 palindrome TCTTTAACAAGCTTGTTAAAGA spanning positions 25962-25983. Second, there are two repeating length-12 palindromes TTATAATTATAA spanning positions 22712-22723 and 22796-22807. Some further investigations into possible biological implications of these palindrome features are proposed.</p>","PeriodicalId":13620,"journal":{"name":"Informs Journal on Computing","volume":"16 4","pages":"331-340"},"PeriodicalIF":2.3000,"publicationDate":"2004-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4066412/pdf/nihms583805.pdf","citationCount":"0","resultStr":"{\"title\":\"Palindromes in SARS and Other Coronaviruses.\",\"authors\":\"David S H Chew, Kwok Pui Choi, Hans Heidner, Ming-Ying Leung\",\"doi\":\"10.1287/ijoc.1040.0087\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>With the identification of a novel coronavirus associated with the <i>severe acute respiratory syndrome</i> (SARS), computational analysis of its RNA genome sequence is expected to give useful clues to help elucidate the origin, evolution, and pathogenicity of the virus. In this paper, we study the collective counts of palindromes in the SARS genome along with all the completely sequenced coronaviruses. Based on a Markov-chain model for the genome sequence, the mean and standard deviation for the number of palindromes at or above a given length are derived. These theoretical results are complemented by extensive simulations to provide empirical estimates. Using a <i>z</i> score obtained from these mathematical and empirical means and standard deviations, we have observed that palindromes of length four are significantly underrepresented in all the coronaviruses in our data set. In contrast, length-six palindromes are significantly underrepresented only in the SARS coronavirus. Two other features are unique to the SARS sequence. First, there is a length-22 palindrome TCTTTAACAAGCTTGTTAAAGA spanning positions 25962-25983. Second, there are two repeating length-12 palindromes TTATAATTATAA spanning positions 22712-22723 and 22796-22807. Some further investigations into possible biological implications of these palindrome features are proposed.</p>\",\"PeriodicalId\":13620,\"journal\":{\"name\":\"Informs Journal on Computing\",\"volume\":\"16 4\",\"pages\":\"331-340\"},\"PeriodicalIF\":2.3000,\"publicationDate\":\"2004-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4066412/pdf/nihms583805.pdf\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Informs Journal on Computing\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1287/ijoc.1040.0087\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Informs Journal on Computing","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1287/ijoc.1040.0087","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}
With the identification of a novel coronavirus associated with the severe acute respiratory syndrome (SARS), computational analysis of its RNA genome sequence is expected to give useful clues to help elucidate the origin, evolution, and pathogenicity of the virus. In this paper, we study the collective counts of palindromes in the SARS genome along with all the completely sequenced coronaviruses. Based on a Markov-chain model for the genome sequence, the mean and standard deviation for the number of palindromes at or above a given length are derived. These theoretical results are complemented by extensive simulations to provide empirical estimates. Using a z score obtained from these mathematical and empirical means and standard deviations, we have observed that palindromes of length four are significantly underrepresented in all the coronaviruses in our data set. In contrast, length-six palindromes are significantly underrepresented only in the SARS coronavirus. Two other features are unique to the SARS sequence. First, there is a length-22 palindrome TCTTTAACAAGCTTGTTAAAGA spanning positions 25962-25983. Second, there are two repeating length-12 palindromes TTATAATTATAA spanning positions 22712-22723 and 22796-22807. Some further investigations into possible biological implications of these palindrome features are proposed.
期刊介绍:
The INFORMS Journal on Computing (JOC) is a quarterly that publishes papers in the intersection of operations research (OR) and computer science (CS). Most papers contain original research, but we also welcome special papers in a variety of forms, including Feature Articles on timely topics, Expository Reviews making a comprehensive survey and evaluation of a subject area, and State-of-the-Art Reviews that collect and integrate recent streams of research.