Towards automated derivation of biological pathways using high-throughput biological data

Third IEEE Symposium on Bioinformatics and Bioengineering, 2003. Proceedings. Pub Date: 2003-03-10. DOI: 10.1109/BIBE.2003.1188925

Yu Chen, T. Joshi, Ying Xu, Dong Xu


Citations: 3

Abstract

Characterizing biological pathways at the genome scale is one of the most important and challenging tasks in the post-genomic era. To address this challenge, we have developed a computational method to systematically and automatically derive partial biological pathways in the yeast Saccharomyces cerevisiae using high-throughput biological data, including yeast two-hybrid data, protein complexes identified by mass spectrometry, genetic interactions, and microarray gene expression data. The inputs of the method are the upstream starting protein (e.g., a sensor of a signal) and the downstream terminal protein (e.g., a transcription factor that induces genes to respond to the signal); the output of the method is the protein interaction chain between the two proteins. The high-throughput data are encoded as an interaction network graph, in which each node represents a protein. The weight of an edge between two nodes models the "closeness" of the two represented proteins in the interaction network; it is defined by a rule-based formula according to the high-throughput data and modified by protein function classification and subcellular localization information. The in vivo protein interaction cascade is predicted as the shortest path between the two input proteins in the interaction network graph, identified using Dijkstra's algorithm. We have also developed a web server for this method (http://compbio.ornl.gov/structure/pathway) for public use. To our knowledge, our method is the first automated method for the general construction of partial biological pathways using a suite of high-throughput biological data. This work demonstrates, as a proof of principle, that computational approaches can be used to discover biological pathways from high-throughput data and biological annotation data.
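The approach described in the abstract (a weighted protein interaction graph searched with Dijkstra's algorithm) can be illustrated with a minimal sketch. The abstract does not give the rule-based weight formula, so everything numeric here is an assumption made for illustration: the `BASE_DISTANCE` values, the 0.5 annotation discount, the `edge_weight` and `add_interaction` helpers, and the protein names `sensor`, `A`, `B`, and `TF` are all hypothetical placeholders, not the authors' values or data. Smaller weights stand for "closer" proteins.

```python
import heapq

# Hypothetical base distances per evidence type (NOT from the paper):
# stronger evidence is assumed to mean a smaller distance.
BASE_DISTANCE = {
    "complex": 1.0,       # co-membership in a protein complex
    "two_hybrid": 2.0,    # yeast two-hybrid interaction
    "genetic": 3.0,       # genetic interaction
    "coexpression": 4.0,  # correlated microarray expression
}


def edge_weight(evidence, same_function=False, same_location=False):
    """Illustrative rule: take the best evidence, then discount by annotation."""
    weight = min(BASE_DISTANCE[kind] for kind in evidence)
    if same_function:
        weight *= 0.5  # assumed discount for shared function class
    if same_location:
        weight *= 0.5  # assumed discount for shared subcellular localization
    return weight


def add_interaction(graph, a, b, weight):
    """Store an undirected weighted edge in a dict-of-dicts graph."""
    graph.setdefault(a, {})[b] = weight
    graph.setdefault(b, {})[a] = weight


def shortest_path(graph, source, target):
    """Dijkstra's algorithm; returns (path, distance) or (None, inf)."""
    dist = {source: 0.0}
    prev = {}
    visited = set()
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue  # stale heap entry
        visited.add(u)
        if u == target:
            break
        for v, w in graph.get(u, {}).items():
            candidate = d + w
            if candidate < dist.get(v, float("inf")):
                dist[v] = candidate
                prev[v] = u
                heapq.heappush(heap, (candidate, v))
    if target not in dist:
        return None, float("inf")
    # Walk predecessors back from the target to recover the chain.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return path, dist[target]


# Toy network with placeholder protein names.
network = {}
add_interaction(network, "sensor", "A", edge_weight(["two_hybrid"], same_location=True))
add_interaction(network, "A", "TF", edge_weight(["complex"]))
add_interaction(network, "sensor", "B", edge_weight(["coexpression"]))
add_interaction(network, "B", "TF", edge_weight(["complex"]))

path, total = shortest_path(network, "sensor", "TF")
print(path, total)
```

In this toy network the chain through `A` is preferred over the chain through `B` because the two-hybrid edge, discounted for shared localization, is shorter than the co-expression edge. A binary heap with lazy deletion of stale entries keeps the sketch short; any standard Dijkstra implementation would serve the same purpose.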


