M. Hirosawa, T. Nagase, Y. Murahashi, R. Kikuno, O. Ohara
Identification of novel transcribed sequences on human chromosome 22 by expressed sequence tag mapping.
DNA Research: An International Journal for Rapid Publication of Reports on Genes and Genomes, vol. 8, pp. 1-9. Published 2001-01-01. DOI: 10.1093/DNARES/8.1.1 (https://doi.org/10.1093/DNARES/8.1.1)
Citations: 23
Abstract
To identify sequences in the human genome that are actually transcribed, we mapped expressed sequence tags (ESTs) of long cDNAs, ranging from 4 kb to 7 kb, along a 33.4-Mb sequence of human chromosome 22, the first human chromosome to be entirely sequenced. In silico EST mapping of 30,683 long cDNAs located 603 cDNA sequences on chromosome 22, which were classified into 169 clusters. Comparison of the genomic loci of these cDNA sequences with the 679 genes already annotated on chromosome 22q revealed that 46 clusters represented newly identified transcribed sequences. To characterize these sequences further, we sequenced 12 cDNAs from the 46 clusters in their entirety. Of these 12 cDNAs, 6 were predicted to include a protein-coding region, while the remaining 6 were unlikely to encode proteins. Interestingly, 3 of the 12 cDNAs had the nucleotide sequences of the strands opposite to previously annotated genes, which suggested that these genomic regions were transcribed bidirectionally. In addition to these 12 newly identified cDNAs, another 12 cDNAs were entirely sequenced because they were likely to contain new information about previously annotated predicted protein-coding sequences. In the cases of KIAA1670 and KIAA1672, a single cDNA sequence covered two separately annotated transcribed regions. For example, the sequence of a clone for KIAA1670 indicated that the CHKL and CPT1B genes were co-transcribed as a contiguous transcript without fusion of the two protein-coding regions. In conclusion, mapping ESTs derived from long cDNAs, followed by sequencing of the entire cDNAs, provided, together with ESTs derived from short cDNAs, indispensable information for the precise annotation of genes on the genome.
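The clustering and comparison steps described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the clusters were built or how novelty was decided, so everything here is an assumption: mapped cDNAs are reduced to half-open `(start, end)` intervals on one chromosome, clusters are formed by merging overlapping intervals, and a cluster counts as novel when it overlaps no annotated gene interval. The function names `cluster_intervals` and `novel_clusters` and all coordinates are hypothetical, and strand is ignored even though the abstract reports opposite-strand transcripts.

```python
from typing import List, Tuple

# Half-open genomic interval (start, end) on a single chromosome (assumed model).
Interval = Tuple[int, int]


def cluster_intervals(hits: List[Interval]) -> List[Interval]:
    """Merge overlapping mapped-cDNA intervals into clusters."""
    clusters: List[Interval] = []
    for start, end in sorted(hits):
        if clusters and start < clusters[-1][1]:
            # Overlaps the current cluster: extend its right edge if needed.
            prev_start, prev_end = clusters[-1]
            clusters[-1] = (prev_start, max(prev_end, end))
        else:
            # No overlap: start a new cluster.
            clusters.append((start, end))
    return clusters


def novel_clusters(clusters: List[Interval], genes: List[Interval]) -> List[Interval]:
    """Return clusters that overlap none of the annotated gene intervals."""
    def overlaps(a: Interval, b: Interval) -> bool:
        return a[0] < b[1] and b[0] < a[1]

    return [c for c in clusters if not any(overlaps(c, g) for g in genes)]


# Hypothetical coordinates, for illustration only (not data from the paper).
example_hits = [(100, 500), (450, 900), (2000, 2600), (5000, 5400), (5300, 5800)]
example_genes = [(0, 950), (5700, 6000)]

example_clusters = cluster_intervals(example_hits)
example_novel = novel_clusters(example_clusters, example_genes)
print(example_clusters)  # [(100, 900), (2000, 2600), (5000, 5800)]
print(example_novel)     # [(2000, 2600)]
```

A strand-aware version would carry a strand field in each interval and compare it during the overlap test, which is what would be needed to flag antisense transcripts such as the three cases noted above.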