Proceedings of the International Conference on Bioinformatics, Computational Biology and Biomedical Informatics: Latest Articles

The Atomizer: Extracting Implicit Molecular Structure from Reaction Network Models

Proceedings of the International Conference on Bioinformatics, Computational Biology and Biomedical Informatics Pub Date : 2013-09-22 DOI: 10.1145/2506583.2512389

J. Tapia, J. Faeder

Citations: 13

A Confidence Measure for Model Fitting with X-Ray Crystallography Data

Proceedings of the International Conference on Bioinformatics, Computational Biology and Biomedical Informatics Pub Date : 2013-09-22 DOI: 10.1145/2506583.2506609

Y. Lei, Ramgopal R. Mettu

Abstract: Structure determination from X-ray crystallography requires numerous stages of iterative refinement between real and reciprocal space. Current methods that fit a model structure to X-ray data therefore utilize a refined experimental electron density map along with a scoring function that characterizes the fit of the density map to the structure. Additional information (e.g., from an energy function or conformational statistics) may supplement this score. In this paper, we derive a novel confidence measure for fitting model fragments into X-ray crystallography data. Given any set of conformations under consideration (e.g., a set of sidechain rotamers, or backbone fragments), and a scoring function for those conformations (e.g., least-squares fit of the associated model density maps), we give a general-purpose method for assessing the confidence of the best-fit model. For the commonly used least-squares measure of fit, our method analyzes the statistics of the matching scores and estimates the probability that the best-fit conformation is the correct underlying model. To our knowledge, ours is the first method for computing such a confidence measure. To demonstrate the practical utility of our method, we study the problem of sidechain placement and show that our confidence measure can be used to detect and correct incorrect conformational predictions. Over nine proteins with density maps of varying resolutions, the Pearson correlation between predictive accuracy (of least-squares fit) and our confidence measure is high, about 0.89. We show that our approach can guide the use of stereochemical restraints when confidence in predictions is low. We also propose a Bayesian data fusion scheme that integrates our confidence measure to weight the contribution of each source of data, which could potentially be used for combining experimental, modeling, and empirical data in automated structure determination.

Citations: 2

Quantum Sequence Analysis: A New Alignment-free Technique For Analyzing Sequences in Feature Space

Proceedings of the International Conference on Bioinformatics, Computational Biology and Biomedical Informatics Pub Date : 2013-09-22 DOI: 10.1145/2506583.2512375

M. Daoud

Citations: 5

Genomic Sequence Fragment Identification using Quasi-Alignment

Proceedings of the International Conference on Bioinformatics, Computational Biology and Biomedical Informatics Pub Date : 2013-09-22 DOI: 10.1145/2506583.2506647

Anurag Nagar, Michael Hahsler

Abstract: Identification of organisms using their genetic sequences is a popular problem in molecular biology and is used in fields such as metagenomics, molecular phylogenetics, and DNA barcoding. These applications depend on searching large sequence databases for individual matching sequences (e.g., with BLAST) and comparing sequences using multiple sequence alignment (e.g., via Clustal), both of which are computationally expensive and require extensive server resources. We propose a novel method for sequence comparison, analysis, and classification which avoids the need to align sequences at the base level or search a database for similarity. Instead, our method uses alignment-free methods to find probabilistic quasi-alignments for longer (typically 100 base pairs) segments. Clustering is then used to create compact models that can be used to analyze a set of sequences and to score and classify unknown sequences against these models. In this paper we expand prior work in two ways. We show how quasi-alignments can be expanded into larger quasi-aligned sections, and we develop a method to classify short sequence fragments. The latter is especially useful when working with Next-Generation Sequencing (NGS) techniques that generate output in the form of relatively short reads. We have conducted extensive experiments using fragments from bacterial 16S rRNA sequences obtained from the Greengenes project, and our results show that the new quasi-alignment based approach can provide excellent results as well as overcome some of the restrictions of the widely used Ribosomal Database Project (RDP) classifier.
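As a rough illustration of what an alignment-free comparison of 100-base segments can look like, the sketch below builds k-mer count profiles per segment and compares them with a Manhattan distance. This is not the authors' quasi-alignment method: the function names, the choice of k = 3, non-overlapping segments, and the distance metric are all assumptions made for the example.

```python
from collections import Counter


def segment_profiles(sequence, segment_length=100, k=3):
    """Return one k-mer count profile per non-overlapping segment.

    Hypothetical helper: segment_length=100 follows the "typically 100
    base pairs" remark in the abstract; k=3 is an arbitrary choice.
    A trailing partial segment is ignored.
    """
    profiles = []
    for start in range(0, len(sequence) - segment_length + 1, segment_length):
        segment = sequence[start:start + segment_length]
        # Count every overlapping k-mer inside this segment.
        kmers = (segment[i:i + k] for i in range(len(segment) - k + 1))
        profiles.append(Counter(kmers))
    return profiles


def manhattan_distance(profile_a, profile_b):
    """Sum of absolute count differences over all k-mers seen in either profile."""
    # Counter returns 0 for missing k-mers, so no special casing is needed.
    return sum(abs(profile_a[kmer] - profile_b[kmer])
               for kmer in set(profile_a) | set(profile_b))
```

Segments whose profiles are close under the chosen distance could then be grouped by any standard clustering routine; which distance and clustering scheme suit real 16S rRNA data is left open here.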

Citations: 1

Computational methods for alternative splicing detection using RNA-seq

Proceedings of the International Conference on Bioinformatics, Computational Biology and Biomedical Informatics Pub Date : 2013-09-22 DOI: 10.1145/2506583.2506666

Ruolin Liu, J. Dickerson

Abstract: RNA-seq technology promises a comprehensive picture of the transcriptome. The traditional way of studying differentially expressed genes is questionable because it fails to consider alternative transcription and post-transcriptional modification. Although some studies have shown that transcript variants from a gene are predominantly generated by alternative transcription, including alternative promoters and transcriptional terminations, rather than by splicing mechanisms, more computational methods focus on alternative splicing detection and quantification. Here we are only interested in methods that are able to detect condition-specific differences using RNA-seq, and we categorize them into two major classes: Region Quantification (RQ) and Isoform Quantification (IQ). RQ breaks down the gene structure into "horizontally parallel pieces", exon units for example, quantifies the expression in these "small pieces", and compares them across different conditions. IQ seeks to separate gene expression into "vertically parallel isoforms", which is itself a challenging task but is more biologically meaningful, and compares a gene's isoform compositions across different conditions. In addition, based on their ability to localize significantly different regions, we can further classify them into "gene-centric" or "exon-centric" methods. The combination of the two classification strategies yields four categories, and we choose one representative for each category. These four representatives are the Cufflinks-Cuffdiff package, DEXSeq, DiffSplice, and SplicingCompass. We evaluate their performance on alternative splicing analysis using three experiments. The first experiment uses a published RNA-seq data set of Arabidopsis under cold conditions (NCBI SRA009031). The second experiment is a simulation study using a custom simulator in which we adopt a negative binomial model to account for variability across biological replicates. The last experiment makes use of RT-PCR to evaluate the results from different methods.
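To make the negative binomial idea concrete, here is a minimal sketch of drawing replicate read counts with extra-Poisson variability. It is not the authors' simulator: the function names, the mean/dispersion parameterization (variance = mean + dispersion × mean²), and the gamma-Poisson construction are assumptions chosen for illustration, using only the Python standard library.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw a Poisson count with Knuth's multiplication method.

    Suitable for modest rates only; exp(-lam) underflows for very large lam.
    """
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_negative_binomial(mean, dispersion, rng):
    """Draw one count from a gamma-Poisson mixture.

    Assumed parameterization: variance = mean + dispersion * mean**2.
    """
    # Gamma with shape 1/dispersion and scale mean*dispersion has the given mean.
    rate = rng.gammavariate(1.0 / dispersion, mean * dispersion)
    return sample_poisson(rate, rng)


def simulate_replicates(mean, dispersion, n_replicates, seed=0):
    """Simulate read counts for one feature across biological replicates."""
    rng = random.Random(seed)
    return [sample_negative_binomial(mean, dispersion, rng)
            for _ in range(n_replicates)]
```

With `dispersion` close to zero the counts approach Poisson behavior; larger values spread the replicates further apart, which is the between-replicate variability such a model is meant to capture.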

Citations: 0

RNA-Seq analyses to reveal the human transcriptome landscape

Proceedings of the International Conference on Bioinformatics, Computational Biology and Biomedical Informatics Pub Date : 2013-09-22 DOI: 10.1145/2506583.2506603

N. Deng, D. Zhu

Abstract: Alternative splicing plays important roles in many biological processes, including diseases. It markedly increases the diversity of the transcriptome and proteome, since over 90% of human genes are alternatively spliced. Recently, high-throughput RNA-Seq technology has made it possible to better characterize and understand transcriptomes. Differential expression and differential splicing are two fundamental analyses for studying differences between transcriptomes. The results from these analyses may reveal the landscape of human transcriptomes and yield new insight into cell differentiation that may lead to human disease. We present the analysis results from two RNA-Seq data sets to study the transcriptomes of a human disease and a type of human cell differentiation. For the first study, we applied our analysis pipeline to an RNA-Seq data set of human Idiopathic Pulmonary Fibrosis (IPF) disease. We present a joint analysis result of differential expression and differential splicing to view genes from both aspects simultaneously. We also provide several non-differentially spliced genes with splicing variants validated by qRT-PCR experiments. For the second study, we developed a novel computational method and applied it to a public RNA-Seq data set of human H1 and H1 differentiation into neural progenitor cell lines. We systematically detected many significant differential splicing events falling into five well-known types of alternative splicing. We present the proportion of the five types of detected differential splicing events in this study. For each type of splicing event, we show a case study to demonstrate the detection procedure of the differential splicing event.

Citations: 1

Identifying protein complexes in AP-MS data with negative evidence via soft Markov clustering

Proceedings of the International Conference on Bioinformatics, Computational Biology and Biomedical Informatics Pub Date : 2013-09-22 DOI: 10.1145/2506583.2506591

Yu-Keng Shih, S. Parthasarathy

Abstract: Protein complexes are key units for discovering protein mechanisms. Traditional protein complex identification methods apply a soft (overlapping) network clustering algorithm to a protein-protein interaction (PPI) network and predict the clusters as protein complexes. Recently, the AP-MS technique and its scoring methods have made it possible to measure the co-complex relationship among proteins. Unlike traditional PPI networks, AP-MS can provide negative evidence, which indicates which proteins are unlikely to be in the same protein complex. However, most existing network clustering algorithms cannot utilize this negative similarity score. In this paper, we propose a soft network clustering algorithm, SR-MCL-N, which can take negative similarity scores into account. SR-MCL-N is a variation of a previous algorithm, SR-MCL, which is a network clustering algorithm based on transition flow. Additionally, since the scoring approach we use produces a dense similarity matrix, a sparsification technique is applied to the similarity matrix. Based on the gold standard CYC2008 and GO terms, we first show that sparsification not only speeds up SR-MCL-N but also lets SR-MCL-N generate more accurate clusters. SR-MCL-N is then compared against SR-MCL and a hierarchical algorithm that also considers negative similarity scores. The results indicate that our algorithm outperforms the others, since SR-MCL-N not only generates overlapping clusters but also takes negative similarity scores into account.

Citations: 0

Measuring Relatedness Between Scientific Entities in Annotation Datasets

Proceedings of the International Conference on Bioinformatics, Computational Biology and Biomedical Informatics Pub Date : 2013-09-22 DOI: 10.1145/2506583.2506651

Guillermo Palma, Maria-Esther Vidal, E. Haag, L. Raschid, Andreas Thor

Abstract: Linked Open Data has made available a diversity of scientific collections where scientists have annotated entities in the datasets with controlled vocabulary terms (CV terms) from ontologies. These semantic annotations encode scientific knowledge, which is captured in annotation datasets. One can mine these datasets to discover relationships and patterns between entities. Determining the relatedness (or similarity) between entities becomes a building block for graph pattern mining; e.g., identifying drug-drug relationships could depend on the similarity of the diseases (conditions) that are associated with each drug. Diverse similarity metrics have been proposed in the literature, e.g., (i) string-similarity metrics, (ii) path-similarity metrics, and (iii) topological-similarity metrics, all of which measure relatedness in a given taxonomy or ontology. In this paper, we consider a novel annotation similarity metric, AnnSim, that measures the relatedness between two entities in terms of the similarity of their annotations. We model AnnSim as a 1-to-1 maximal weighted bipartite match, and we exploit properties of existing solvers to provide an efficient solution. We empirically study the effectiveness of AnnSim on real-world datasets of genes and their GO annotations, clinical trials, and a human disease benchmark. Our results suggest that AnnSim can provide a deeper understanding of the relatedness of concepts and can provide an explanation of potential novel patterns.
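The 1-to-1 maximal weighted bipartite match can be illustrated on very small annotation sets with an exhaustive search. The sketch below is not the authors' implementation and does not use an efficient solver: the function name `ann_sim`, the symmetric pairwise similarity callback `term_sim`, and the normalization (twice the matched weight divided by the combined set size) are assumptions made for this example.

```python
from itertools import permutations


def ann_sim(annotations_a, annotations_b, term_sim):
    """Brute-force annotation similarity for two small annotation sets.

    term_sim(x, y) is a hypothetical, symmetric pairwise similarity in [0, 1].
    The cost grows factorially, so this is only for illustration.
    """
    a, b = list(annotations_a), list(annotations_b)
    if not a or not b:
        return 0.0
    # Match from the shorter side so each of its terms gets exactly one partner.
    small, large = (a, b) if len(a) <= len(b) else (b, a)
    best = 0.0
    # Enumerate every 1-to-1 assignment of the shorter side into the longer side.
    for chosen in permutations(large, len(small)):
        total = sum(term_sim(s, t) for s, t in zip(small, chosen))
        best = max(best, total)
    # Assumed normalization: 2 * matched weight / (|A| + |B|).
    return 2.0 * best / (len(a) + len(b))
```

For realistic annotation sets, a polynomial-time assignment solver (for example, the Hungarian algorithm) would replace the exhaustive loop; the exhaustive version is kept here because it makes the 1-to-1 constraint easy to see.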

Citations: 17

Estimating the Number of Manually Segmented Cellular Objects Required to Evaluate the Performance of a Segmentation Algorithm

Proceedings of the International Conference on Bioinformatics, Computational Biology and Biomedical Informatics Pub Date : 2013-09-22 DOI: 10.1145/2506583.2512384

A. Peskin, J. Chalfoun, K. Kafadar, J. Elliott

Citations: 1

A Constrained K-shortest Path Algorithm to Rank the Topologies of the Protein Secondary Structure Elements Detected in CryoEM Volume Maps

Proceedings of the International Conference on Bioinformatics, Computational Biology and Biomedical Informatics Pub Date : 2013-09-22 DOI: 10.1145/2506583.2506705

Kamal Al-Nasr, Lin Chen, D. Ranjan, M. Zubair, Dong Si, Jing He

Citations: 3