Restriction sites as identification tags for the gene catalog: a 2D gel model.

Applied and theoretical electrophoresis: the official journal of the International Electrophoresis Society. Pub Date: 1993-01-01

J R Frey, J R Kettman, I Lefkovits

Volume 3, issue 6, pages 283-96. Publication type: Journal Article.


Abstract

In our effort to collect, organize and assemble data from lymphocyte cDNA libraries, we assign DNA restriction sites collectively to the spots on two-dimensional (2D) gel patterns. To test the efficiency and reliability of this approach, we modeled the restriction analysis of cDNA libraries with a panel of restriction endonucleases. The work has two parts. In the first, we chose 255 proteins from the EMBL database and determined whether their coding sequences contain restriction sites for the enzymes of our choice. To obtain sufficient discriminatory power, we decided to use a relatively large number of enzymes with both low and high cutting frequencies. In total, 13 restriction enzymes were chosen, which can distinguish 2^13 = 8192 different restriction site combinations. We compiled a table in which the absence or presence of each restriction site is recorded as a 'zero' or a 'one'. Such a restriction pattern can be read as a binary number. Binary numbers of at most 13 digits would uniquely identify each of the 255 proteins if the nucleotide sequences were truly random. Because restriction sites are not randomly distributed, the 'typing' does not yield a unique assignment. The choice of sequences was not random either. In fact, some human nucleotide sequences share the same cut number (the decimal equivalent of the binary number representing the restriction pattern). In spite of this redundancy, 141 coding sequences could be uniquely distinguished by this treatment.

In the second part of the project, we used the same coding sequences to prepare two-dimensional maps (plots of charge vs size) of the kind obtained from experimental 2D gels, and submitted such a map, together with 13 maps of restriction enzyme treated populations, to computer image analysis. Ideally, one would expect results (cut numbers) congruent with those obtained in the first part of the work. In the modeled system, however, we were confronted with 2D maps that closely resembled the experimental situation (e.g. some spots were close together and overlapping) and with instances of incorrect spot detection yielding 'false cut numbers'. Of the 255 proteins, we were able to assign 161 unequivocally. To implement the model in an actual experiment, we will perform the digestion with the restriction enzymes in duplicate, and only spots assigned the same cut number in the two independent treatments will be considered to carry a valid restriction tag.
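
The cut-number idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not name the 13 enzymes, so the three-enzyme panel below (EcoRI, BamHI, HindIII), the bit order, the toy sequences and the function names cut_number, uniquely_tagged and valid_tags are all assumptions made for illustration. Because the three recognition sites are palindromic, searching one strand is sufficient here.

```python
from collections import Counter

# Hypothetical three-enzyme panel; the paper used 13 enzymes that the
# abstract does not name. The recognition sites are the standard ones
# for these three enzymes.
PANEL = [("EcoRI", "GAATTC"), ("BamHI", "GGATCC"), ("HindIII", "AAGCTT")]


def cut_number(sequence, panel=PANEL):
    """Return the decimal value of the presence/absence pattern.

    The first enzyme in the panel is treated as the most significant bit
    (an assumed convention; the abstract does not state the bit order).
    """
    seq = sequence.upper()
    number = 0
    for _name, site in panel:
        # Shift in a 1 if the site occurs anywhere in the sequence, else a 0.
        number = (number << 1) | (1 if site in seq else 0)
    return number


def uniquely_tagged(cut_numbers):
    """Return the names whose cut number is shared with no other entry."""
    counts = Counter(cut_numbers.values())
    return sorted(name for name, n in cut_numbers.items() if counts[n] == 1)


def valid_tags(first_run, second_run):
    """Keep only spots given the same cut number in both digestions."""
    return {spot: n for spot, n in first_run.items()
            if second_run.get(spot) == n}


# Toy coding sequences (made up, not taken from the EMBL database).
toy = {
    "seqA": "ATGGAATTCCCGGATCCTAA",  # EcoRI and BamHI sites -> pattern 110
    "seqB": "ATGAAGCTTCCCTAA",       # HindIII site only     -> pattern 001
    "seqC": "ATGCCCAAGCTTGGGTAA",    # HindIII site only     -> pattern 001
}
numbers = {name: cut_number(seq) for name, seq in toy.items()}
print(numbers)                   # cut number per sequence
print(uniquely_tagged(numbers))  # sequences with an unshared cut number
```

In this toy set, seqB and seqC share a cut number and so cannot be told apart, which mirrors the redundancy the abstract reports. With a 13-enzyme panel the same function would return values from 0 to 8191. The valid_tags helper corresponds to the duplicate-digestion rule: a spot keeps its tag only if both independent runs give it the same cut number.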
