Research in computational molecular biology : ... Annual International Conference, RECOMB ... : proceedings. RECOMB (Conference : 2005- ): latest articles

Traversing the k-mer Landscape of NGS Read Datasets for Quality Score Sparsification.
Y William Yu, Deniz Yorukoglu, Bonnie Berger
Abstract: It is becoming increasingly impractical to indefinitely store raw sequencing data for later processing in an uncompressed state. In this paper, we describe a scalable compressive framework, Read-Quality-Sparsifier (RQS), which substantially outperforms the compression ratio and speed of other de novo quality score compression methods while maintaining SNP-calling accuracy. Surprisingly, RQS also improves the SNP-calling accuracy on a gold-standard, real-life sequencing dataset (NA12878) using a k-mer density profile constructed from 77 other individuals from the 1000 Genomes Project. This improvement in downstream accuracy emerges from the observation that quality score values within NGS datasets are inherently encoded in the k-mer landscape of the genomic sequences. To our knowledge, RQS is the first scalable sequence-based quality compression method that can efficiently compress quality scores of terabyte-sized and larger sequencing datasets.
Availability: An implementation of our method, RQS, is available for download at: http://rqs.csail.mit.edu/.
Volume 8394, pages 385-399. Published 2014-04-01. DOI: https://doi.org/10.1007/978-3-319-05269-4_31
Citations: 26
AptaCluster - A Method to Cluster HT-SELEX Aptamer Pools and Lessons from its Application.
Jan Hoinka, Alexey Berezhnoy, Zuben E Sauna, Eli Gilboa, Teresa M Przytycka
Abstract: Systematic Evolution of Ligands by EXponential Enrichment (SELEX) is a well established experimental procedure to identify aptamers - synthetic single-stranded (ribo)nucleic molecules that bind to a given molecular target. Recently, new sequencing technologies have revolutionized the SELEX protocol by allowing for deep sequencing of the selection pools after each cycle. The emergence of High Throughput SELEX (HT-SELEX) has opened the field to new computational opportunities and challenges that are yet to be addressed. To aid the analysis of the results of HT-SELEX and to advance the understanding of the selection process itself, we developed AptaCluster. This algorithm allows for an efficient clustering of whole HT-SELEX aptamer pools; a task that could not be accomplished with traditional clustering algorithms due to the enormous size of such datasets. We performed HT-SELEX with Interleukin 10 receptor alpha chain (IL-10RA) as the target molecule and used AptaCluster to analyze the resulting sequences. AptaCluster allowed for the first survey of the relationships between sequences in different selection rounds and revealed previously not appreciated properties of the SELEX protocol. As the first tool of this kind, AptaCluster enables novel ways to analyze and to optimize the HT-SELEX procedure. Our AptaCluster algorithm is available as a very fast multiprocessor implementation upon request.
Volume 8394, pages 115-128. Published 2014-01-01. DOI: https://doi.org/10.1007/978-3-319-05269-4_9
Citations: 66
Learning Sequence Determinants of Protein:protein Interaction Specificity with Sparse Graphical Models.
Hetunandan Kamisetty, Bornika Ghosh, Christopher James Langmead, Chris Bailey-Kellogg
Abstract: In studying the strength and specificity of interaction between members of two protein families, key questions center on which pairs of possible partners actually interact, how well they interact, and why they interact while others do not. The advent of large-scale experimental studies of interactions between members of a target family and a diverse set of possible interaction partners offers the opportunity to address these questions. We develop here a method, DgSpi (Data-driven Graphical models of Specificity in Protein:protein Interactions), for learning and using graphical models that explicitly represent the amino acid basis for interaction specificity (why) and extend earlier classification-oriented approaches (which) to predict the ΔG of binding (how well). We demonstrate the effectiveness of our approach in analyzing and predicting interactions between a set of 82 PDZ recognition modules, against a panel of 217 possible peptide partners, based on data from MacBeath and colleagues. Our predicted ΔG values are highly predictive of the experimentally measured ones, reaching correlation coefficients of 0.69 in 10-fold cross-validation and 0.63 in leave-one-PDZ-out cross-validation. Furthermore, the model serves as a compact representation of amino acid constraints underlying the interactions, enabling protein-level ΔG predictions to be naturally understood in terms of residue-level constraints. Finally, as a generative model, DgSpi readily enables the design of new interacting partners, and we demonstrate that designed ligands are novel and diverse.
Volume 8394, pages 129-143. Published 2014-01-01. DOI: https://doi.org/10.1007/978-3-319-05269-4_10
Citations: 3
Decoding coalescent hidden Markov models in linear time.
Kelley Harris, Sara Sheehan, John A Kamm, Yun S Song
Abstract: In many areas of computational biology, hidden Markov models (HMMs) have been used to model local genomic features. In particular, coalescent HMMs have been used to infer ancient population sizes, migration rates, divergence times, and other parameters such as mutation and recombination rates. As more loci, sequences, and hidden states are added to the model, however, the runtime of coalescent HMMs can quickly become prohibitive. Here we present a new algorithm for reducing the runtime of coalescent HMMs from quadratic in the number of hidden time states to linear, without making any additional approximations. Our algorithm can be incorporated into various coalescent HMMs, including the popular method PSMC for inferring variable effective population sizes. Here we implement this algorithm to speed up our demographic inference method diCal, which is equivalent to PSMC when applied to a sample of two haplotypes. We demonstrate that the linear-time method can reconstruct a population size change history more accurately than the quadratic-time method, given similar computation resources. We also apply the method to data from the 1000 Genomes project, inferring a high-resolution history of size changes in the European population.
Volume 8394, pages 100-114. Published 2014-01-01. DOI: https://doi.org/10.1007/978-3-319-05269-4_8
Citations: 10
Fast and Accurate Calculation of Protein Depth by Euclidean Distance Transform.
Dong Xu, Hua Li, Yang Zhang
Abstract: The depth of each atom/residue in a protein structure is a key attribution that has been widely used in protein structure modeling and function annotation. However, the accurate calculation of depth is time consuming. Here, we propose to use the Euclidean distance transform (EDT) to calculate the depth, which conveniently converts the protein structure to a 3D gray-scale image with each pixel labeling the minimum distance of the pixel to the surface of the molecule (i.e. the depth). We tested the proposed EDT method on a set of 261 non-redundant protein structures. The data show that the EDT method is 2.6 times faster than the widely used method by Chakravarty and Varadarajan. The depth value by EDT method is also highly accurate, which is almost identical to the depth calculated by exhaustive search (Pearson's correlation coefficient ≈ 1). We believe the EDT-based depth calculation program can be used as an efficient tool to assist the studies of protein fold recognition and structure-based function annotation.
Volume 7821, pages 304-316. Published 2013-01-01. DOI: https://doi.org/10.1007/978-3-642-37195-0_30
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4098708/pdf/nihms592637.pdf
Citations: 4
Combinatorics of Genome Rearrangements
G. Fertin, A. Labarre, I. Rusu, Éric Tannier, Stéphane Vialette
Abstract: From one cell to another, from one individual to another, and from one species to another, the content of DNA molecules is often similar. The organization of these molecules, however, differs dramatically, and the mutations that affect this organization are known as genome rearrangements. Combinatorial methods are used to reconstruct putative rearrangement scenarios in order to explain the evolutionary history of a set of species, often formalizing the evolutionary events that can explain the multiple combinations of observed genomes as combinatorial optimization problems. This book offers the first comprehensive survey of this rapidly expanding application of combinatorial optimization. It can be used as a reference for experienced researchers or as an introductory text for a broader audience. Genome rearrangement problems have proved so interesting from a combinatorial point of view that the field now belongs as much to mathematics as to biology. This book takes a mathematically oriented approach, but provides biological background when necessary. It presents a series of models, beginning with the simplest (which is progressively extended by dropping restrictions), each constructing a genome rearrangement problem. The book also discusses an important generalization of the basic problem known as the median problem, surveys attempts to reconstruct the relationships between genomes with phylogenetic trees, and offers a collection of summaries and appendixes with useful additional information. Computational Molecular Biology series.
Volume 28 1, pages I-XI, 1-288. Published 2009-06-05. DOI: https://doi.org/10.7551/mitpress/9780262062824.001.0001
Citations: 326
Boosting Protein Threading Accuracy.
Jian Peng, Jinbo Xu
Abstract: Protein threading is one of the most successful protein structure prediction methods. Most protein threading methods use a scoring function linearly combining sequence and structure features to measure the quality of a sequence-template alignment so that a dynamic programming algorithm can be used to optimize the scoring function. However, a linear scoring function cannot fully exploit interdependency among features and thus limits alignment accuracy. This paper presents a nonlinear scoring function for protein threading, which not only can model interactions among different protein features, but also can be efficiently optimized using a dynamic programming algorithm. We achieve this by modeling the threading problem using a probabilistic graphical model, Conditional Random Fields (CRF), and training the model using the gradient tree boosting algorithm. The resultant model is a nonlinear scoring function consisting of a collection of regression trees. Each regression tree models a type of nonlinear relationship among sequence and structure features. Experimental results indicate that this new threading model can effectively leverage weak biological signals and improve both alignment accuracy and fold recognition rate greatly.
Volume 5541, pages 31-45. Published 2009-01-01. DOI: https://doi.org/10.1007/978-3-642-02008-7_3
Citations: 69
A Probabilistic Graphical Model for Ab Initio Folding.
Feng Zhao, Jian Peng, Joe Debartolo, Karl F Freed, Tobin R Sosnick, Jinbo Xu
Abstract: Despite significant progress in recent years, ab initio folding is still one of the most challenging problems in structural biology. This paper presents a probabilistic graphical model for ab initio folding, which employs Conditional Random Fields (CRFs) and directional statistics to model the relationship between the primary sequence of a protein and its three-dimensional structure. Different from the widely-used fragment assembly method and the lattice model for protein folding, our graphical model can explore protein conformations in a continuous space according to their probability. The probability of a protein conformation reflects its stability and is estimated from PSI-BLAST sequence profile and predicted secondary structure. Experimental results indicate that this new method compares favorably with the fragment assembly method and the lattice model.
Volume 5541, pages 59-73. Published 2009-01-01. DOI: https://doi.org/10.1007/978-3-642-02008-7_5
Citations: 9
Structural Alignment of Pseudoknotted RNA
Banu Dost, B. Han, Shaojie Zhang, V. Bafna
Volume 75 1, pages 143-158. Published 2006-04-02. DOI: https://doi.org/10.1007/11732990_13
Citations: 48
Immunological bioinformatics
O. Lund, M. Nielsen, C. Lundegaard, C. Keşmir, S. Brunak
Volume 11 1, pages I-XII, 1-296. Published 2005-01-01. DOI: https://doi.org/10.7551/mitpress/3679.001.0001
Citations: 61