Computational systems bioinformatics. Computational Systems Bioinformatics Conference最新文献

筛选
英文 中文
Novel Gene Discovery in the Human Malaria Parasite using Nucleosome Positioning Data. 利用核糖体定位数据发现人类疟疾寄生虫中的新基因
N Pokhriyal, N Ponts, E Y Harris, K G Le Roch, S Lonardi
{"title":"Novel Gene Discovery in the Human Malaria Parasite using Nucleosome Positioning Data.","authors":"N Pokhriyal, N Ponts, E Y Harris, K G Le Roch, S Lonardi","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Recent genome-wide studies on nucleosome positioning in model organisms have shown strong evidence that nucleosome landscapes in the proximity of protein-coding genes exhibit regular characteristic patterns. Here, we propose a computational framework to discover novel genes in the human malaria parasite genome <i>P. falciparum</i> using nucleosome positioning inferred from MAINE-seq data. We rely on a classifier trained on the nucleosome landscape profiles of experimentally verified genes, and then used to discover new genes (without considering the primary DNA sequence). Cross-validation experiments show that our classifier is very accurate. About two thirds of the locations reported by the classifier match experimentally determined expressed sequence tags in GenBank, for which no gene has been annotated in the human malaria parasite.</p>","PeriodicalId":72665,"journal":{"name":"Computational systems bioinformatics. Computational Systems Bioinformatics Conference","volume":"9 ","pages":"124-135"},"PeriodicalIF":0.0,"publicationDate":"2010-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4112967/pdf/nihms504365.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"32547026","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Estimating support for protein-protein interaction data with applications to function prediction. 估计蛋白质-蛋白质相互作用数据在功能预测中的应用支持度。
Erliang Zeng, C. Ding, G. Narasimhan, S. Holbrook
{"title":"Estimating support for protein-protein interaction data with applications to function prediction.","authors":"Erliang Zeng, C. Ding, G. Narasimhan, S. Holbrook","doi":"10.1142/9781848162648_0007","DOIUrl":"https://doi.org/10.1142/9781848162648_0007","url":null,"abstract":"Almost every cellular process requires the interactions of pairs or larger complexes of proteins. High throughput protein-protein interaction (PPI) data have been generated using techniques such as the yeast two-hybrid systems, mass spectrometry method, and many more. Such data provide us with a new perspective to predict protein functions and to generate protein-protein interaction networks, and many recent algorithms have been developed for this purpose. However, PPI data generated using high throughput techniques contain a large number of false positives. In this paper, we have proposed a novel method to evaluate the support for PPI data based on gene ontology information. If the semantic similarity between genes is computed using gene ontology information and using Resnik's formula, then our results show that we can model the PPI data as a mixture model predicated on the assumption that true protein-protein interactions will have higher support than the false positives in the data. Thus semantic similarity between genes serves as a metric of support for PPI data. Taking it one step further, new function prediction approaches are also being proposed with the help of the proposed metric of the support for the PPI data. These new function prediction approaches outperform their conventional counterparts. New evaluation methods are also proposed.","PeriodicalId":72665,"journal":{"name":"Computational systems bioinformatics. Computational Systems Bioinformatics Conference","volume":"7 1","pages":"73-84"},"PeriodicalIF":0.0,"publicationDate":"2008-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"64003189","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 13
A max-flow based approach to the identification of protein complexes using protein interaction and microarray data. 利用蛋白质相互作用和微阵列数据,基于最大流量的方法来鉴定蛋白质复合物。
Jianxing Feng, Rui Jiang, Tao Jiang
{"title":"A max-flow based approach to the identification of protein complexes using protein interaction and microarray data.","authors":"Jianxing Feng,&nbsp;Rui Jiang,&nbsp;Tao Jiang","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>The emergence of high-throughput technologies leads to abundant protein-protein interaction (PPI) data and microarray gene expression profiles, and provides a great opportunity for the identification of novel protein complexes using computational methods. Although it has been demonstrated in the literature that methods using protein-protein interaction data alone can successfully predict a large number of protein complexes, the incorporation of gene expression profiles could help refine the putative complexes and hence improve the accuracy of the computational methods. By combining protein-protein interaction data and microarray gene expression profiles, we propose a novel Graph Fragmentation Algorithm (GFA) for protein complex identification. Adapted from a classical max-flow algorithm for finding the (weighted) densest subgraphs, GFA first finds large (weighted) dense subgraphs in a protein-protein interaction network and then breaks each such subgraph into fragments iteratively by weighting its nodes appropriately in terms of their corresponding log fold changes in the microarray data, until the fragment subgraphs are sufficiently small. Our extensive tests on three widely used protein-protein interaction datasets and comparisons with the latest methods for protein complex identification demonstrate the superior performance of our method in terms of accuracy, efficiency, and capability in predicting novel protein complexes. Given the high specificity (or precision) that our method has achieved, we conjecture that our prediction results imply more than 200 novel protein complexes.</p>","PeriodicalId":72665,"journal":{"name":"Computational systems bioinformatics. Computational Systems Bioinformatics Conference","volume":"7 ","pages":"51-62"},"PeriodicalIF":0.0,"publicationDate":"2008-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"28336172","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
MSDash: mass spectrometry database and search. 质谱数据库和搜索。
Zhan Wu, Gilles Lajoie, Bin Ma
{"title":"MSDash: mass spectrometry database and search.","authors":"Zhan Wu,&nbsp;Gilles Lajoie,&nbsp;Bin Ma","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Along with the wide application of mass spectrometry in proteomics, more and more mass spectrometry data are becoming publicly available. Several public mass spectrometry data repositories have been built on the Internet. However, most of these repositories are devoid of effective searching methods. In this paper we describe a new mass spectrometry data library, and a novel method to efficiently index and search in the library for spectra that are similar to a query spectrum. A public online server have been set up and demonstrated outstanding speed and scalability of our methods. Together with the mass spectrometry library, our searching method can improve the protein identification confidence by comparing a spectrum with the ones that are already characterized in the database. The searching method can also be used alone to cluster the similar spectra in a mass spectrometry dataset together, in order to to improve the speed and accuracy of the protein identification or quantification.</p>","PeriodicalId":72665,"journal":{"name":"Computational systems bioinformatics. Computational Systems Bioinformatics Conference","volume":"7 ","pages":"63-71"},"PeriodicalIF":0.0,"publicationDate":"2008-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"28336173","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Error tolerant sibship reconstruction in wild populations. 野生种群的容错兄弟姐妹重建。
Saad I Sheikh, Tanya Y Berger-Wolf, Mary V Ashley, Isabel C Caballero, Wanpracha Chaovalitwongse, Bhaskar DasGupta
{"title":"Error tolerant sibship reconstruction in wild populations.","authors":"Saad I Sheikh,&nbsp;Tanya Y Berger-Wolf,&nbsp;Mary V Ashley,&nbsp;Isabel C Caballero,&nbsp;Wanpracha Chaovalitwongse,&nbsp;Bhaskar DasGupta","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Kinship analysis using genetic data is important for many biological applications, including many in conservation biology. Wide availability of microsatellites has boosted studies in wild populations that rely on the knowledge of kinship, particularly sibling relationships (sibship). While there exist many methods for reconstructing sibling relationships, almost none account for errors and mutations in microsatellite data, which are prevalent and affect the quality of reconstruction. We present an error-tolerant method for reconstructing sibling relationships based on the concept of consensus methods. We test our approach on both real and simulated data, with both pre-existing and introduced errors. Our method is highly accurate on almost all simulations, giving over 90% accuracy in most cases. Ours is the first method designed to tolerate errors while making no assumptions about the population or the sampling.</p>","PeriodicalId":72665,"journal":{"name":"Computational systems bioinformatics. Computational Systems Bioinformatics Conference","volume":"7 ","pages":"273-84"},"PeriodicalIF":0.0,"publicationDate":"2008-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"28337729","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Iterative non-sequential protein structural alignment. 迭代非顺序蛋白质结构比对。
Saeed Salem, Mohammed J Zaki
{"title":"Iterative non-sequential protein structural alignment.","authors":"Saeed Salem,&nbsp;Mohammed J Zaki","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Structural similarity between proteins gives us insights on the evolutionary relationship between proteins which have low sequence similarity. In this paper, we present a novel approach called STSA for non-sequential pair-wise structural alignment. Starting from an initial alignment, our approach iterates over a two-step process, a superposition step and an alignment step, until convergence. Given two superposed structures, we propose a novel greedy algorithm to construct both sequential and non-sequential alignments. The quality of STSA alignments is evident in the high agreement it has with the reference alignments in the challenging-to-align RPIC set. Moreover, on a dataset of 4410 protein pairs selected from the CATH database, STSA has a high sensitivity and high specificity values and is competitive with state-of-the-art alignment methods and gives longer alignments with lower rmsd. The STSA software along with the data sets will be made available on line at http://www.cs.rpi.edu/-zaki/software/STSA.</p>","PeriodicalId":72665,"journal":{"name":"Computational systems bioinformatics. Computational Systems Bioinformatics Conference","volume":"7 ","pages":"183-94"},"PeriodicalIF":0.0,"publicationDate":"2008-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"28337243","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Optimizing Bayes error for protein structure model selection by stability mutagenesis. 基于稳定性诱变的蛋白质结构模型选择贝叶斯误差优化。
Xiaoduan Ye, A. Friedman, C. Bailey-Kellogg
{"title":"Optimizing Bayes error for protein structure model selection by stability mutagenesis.","authors":"Xiaoduan Ye, A. Friedman, C. Bailey-Kellogg","doi":"10.1142/9781848162648_0009","DOIUrl":"https://doi.org/10.1142/9781848162648_0009","url":null,"abstract":"Site-directed mutagenesis affects protein stability in a manner dependent on the local structural environment of the mutated residue; e.g., a hydrophobic to polar substitution would behave differently in the core vs. on the surface of the protein. Thus site-directed mutagenesis followed by stability measurement enables evaluation of and selection among predicted structure models, based on consistency between predicted and experimental stability changes (DeltaDeltaGo values). This paper develops a method for planning a set of individual site-directed mutations for protein structure model selection, so as to minimize the Bayes error, i.e., the probability of choosing the wrong model. While in general it is hard to calculate exactly the multi-dimensional Bayes error defined by a set of mutations, we leverage the structure of \"DeltaDeltaGo space\" to develop tight upper and lower bounds. We further develop a lower bound on the Bayes error of any plan that uses a fixed number of mutations from a set of candidates. We use this bound in a branch-and-bound planning algorithm to find optimal and near-optimal plans. We demonstrate the significance and effectiveness of this approach in planning mutations for elucidating the structure of the pTfa chaperone protein from bacteriophage lambda.","PeriodicalId":72665,"journal":{"name":"Computational systems bioinformatics. Computational Systems Bioinformatics Conference","volume":"7 1","pages":"99-108"},"PeriodicalIF":0.0,"publicationDate":"2008-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"64003318","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 3
Fast and accurate multi-class protein fold recognition with spatial sample kernels. 利用空间样本核快速准确地识别多类蛋白质折叠。
P. Kuksa, Pai-Hsi Huang, V. Pavlovic
{"title":"Fast and accurate multi-class protein fold recognition with spatial sample kernels.","authors":"P. Kuksa, Pai-Hsi Huang, V. Pavlovic","doi":"10.1142/9781848162648_0012","DOIUrl":"https://doi.org/10.1142/9781848162648_0012","url":null,"abstract":"Establishing structural or functional relationship between sequences, for instance to infer the structural class of an unannotated protein, is a key task in biological sequence analysis. Recent computational methods such as profile and neighborhood mismatch kernels have shown very promising results for protein sequence classification, at the cost of high computational complexity. In this study we address the multi-class sequence classification problems using a class of string-based kernels, the sparse spatial sample kernels (SSSK), that are both biologically motivated and efficient to compute. The proposed methods can work with very large databases of protein sequences and show substantial improvements in computing time over the existing methods. Application of the SSSK to the multi-class protein prediction problems (fold recognition and remote homology detection) yields significantly better performance than existing state-of-the-art algorithms.","PeriodicalId":72665,"journal":{"name":"Computational systems bioinformatics. Computational Systems Bioinformatics Conference","volume":"59 1","pages":"133-43"},"PeriodicalIF":0.0,"publicationDate":"2008-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"64003404","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 4
Consistent alignment of metabolic pathways without abstraction. 一致的排列代谢途径没有抽象。
F. Ay, Tamer Kahveci, V. de Crécy-Lagard
{"title":"Consistent alignment of metabolic pathways without abstraction.","authors":"F. Ay, Tamer Kahveci, V. de Crécy-Lagard","doi":"10.1142/9781848162648_0021","DOIUrl":"https://doi.org/10.1142/9781848162648_0021","url":null,"abstract":"Pathways show how different biochemical entities interact with each other to perform vital functions for the survival of organisms. Similarities between pathways indicate functional similarities that are difficult to identify by comparing the individual entities that make up those pathways. When interacting entities are of single type, the problem of identifying similarities reduces to graph isomorphism problem. However, for pathways with varying types of entities, such as metabolic pathways, alignment problem is more challenging. Existing methods, often, address the metabolic pathway alignment problem by ignoring all the entities except for one type. This kind of abstraction reduces the relevance of the alignment significantly as it causes losses in the information content. In this paper, we develop a method to solve the pairwise alignment problem for metabolic pathways. One distinguishing feature of our method is that it aligns reactions, compounds and enzymes without abstraction of pathways. We pursue the intuition that both pairwise similarities of entities (homology) and their organization (topology) are crucial for metabolic pathway alignment. In our algorithm, we account for both by creating an eigenvalue problem for each entity type. We enforce the consistency by considering the reachability sets of the aligned entities. Our experiments show that, our method finds biologically and statistically significant alignments in the order of seconds for pathways with approximately 100 entities.","PeriodicalId":72665,"journal":{"name":"Computational systems bioinformatics. Computational Systems Bioinformatics Conference","volume":"7 1","pages":"237-48"},"PeriodicalIF":0.0,"publicationDate":"2008-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"64003751","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 16
Error tolerant sibship reconstruction in wild populations. 野生种群的容错兄弟姐妹重建。
S. Sheikh, T. Berger-Wolf, M. Ashley, I. Caballero, W. Chaovalitwongse, B. Dasgupta
{"title":"Error tolerant sibship reconstruction in wild populations.","authors":"S. Sheikh, T. Berger-Wolf, M. Ashley, I. Caballero, W. Chaovalitwongse, B. Dasgupta","doi":"10.1142/9781848162648_0024","DOIUrl":"https://doi.org/10.1142/9781848162648_0024","url":null,"abstract":"Kinship analysis using genetic data is important for many biological applications, including many in conservation biology. Wide availability of microsatellites has boosted studies in wild populations that rely on the knowledge of kinship, particularly sibling relationships (sibship). While there exist many methods for reconstructing sibling relationships, almost none account for errors and mutations in microsatellite data, which are prevalent and affect the quality of reconstruction. We present an error-tolerant method for reconstructing sibling relationships based on the concept of consensus methods. We test our approach on both real and simulated data, with both pre-existing and introduced errors. Our method is highly accurate on almost all simulations, giving over 90% accuracy in most cases. Ours is the first method designed to tolerate errors while making no assumptions about the population or the sampling.","PeriodicalId":72665,"journal":{"name":"Computational systems bioinformatics. Computational Systems Bioinformatics Conference","volume":"7 1","pages":"273-84"},"PeriodicalIF":0.0,"publicationDate":"2008-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"64003910","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 19
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
相关产品
×
本文献相关产品
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信