Algorithms for Molecular Biology: latest articles

Fast and efficient Rmap assembly using the Bi-labelled de Bruijn graph.
IF 1, Zone 4, Biology
Algorithms for Molecular Biology Pub Date : 2021-05-25 DOI: 10.1186/s13015-021-00182-9
Kingshuk Mukherjee, Massimiliano Rossi, Leena Salmela, Christina Boucher
Volume 16, issue 1, page 6
Abstract: Genome-wide optical maps are high-resolution restriction maps that give a unique numeric representation to a genome. They are produced by assembling hundreds of thousands of single-molecule optical maps, which are called Rmaps. Unfortunately, there are very few choices for assembling Rmap data: only one publicly available non-proprietary assembly method and one proprietary software package that is available as an executable. Furthermore, the publicly available method, by Valouev et al. (Proc Natl Acad Sci USA 103(43):15770-15775, 2006), follows the overlap-layout-consensus (OLC) paradigm and is therefore unable to scale to relatively large genomes. The algorithm behind the proprietary method, Bionano Genomics' Solve, is largely unknown. In this paper, we extend the definition of bi-labels in the paired de Bruijn graph to the context of optical mapping data and present the first de Bruijn graph based method for Rmap assembly. We implement our approach, which we refer to as RMAPPER, and compare its performance against the assembler of Valouev et al. and Solve by Bionano Genomics on data from three genomes: E. coli, human, and climbing perch fish (Anabas testudineus). Our method ran successfully on all three genomes, whereas the method of Valouev et al. ran successfully only on E. coli. Moreover, on the human genome RMAPPER was at least 130 times faster than Bionano Solve, used five times less memory, and produced the highest genome fraction with zero mis-assemblies. Our software, RMAPPER, is written in C++ and is publicly available under the GNU General Public License at https://github.com/kingufl/Rmapper .
Citations: 3
Exact transcript quantification over splice graphs.
IF 1, Zone 4, Biology
Algorithms for Molecular Biology Pub Date : 2021-05-10 DOI: 10.1186/s13015-021-00184-7
Cong Ma, Hongyu Zheng, Carl Kingsford
Volume 16, issue 1, page 5
Abstract:
Background: The probability of sequencing a set of RNA-seq reads can be directly modeled using the abundances of splice junctions in splice graphs instead of the abundances of a list of transcripts. We call this model graph quantification, which was first proposed by Bernard et al. (Bioinformatics 30:2447-55, 2014). The model can be viewed as a generalization of transcript expression quantification where every full path in the splice graph is a possible transcript. However, the previous graph quantification model assumes that the length of single-end reads or paired-end fragments is fixed.
Results: We improve this model to handle variable-length reads or fragments and to incorporate bias correction. We prove that our model is equivalent to running a transcript quantifier with exactly the set of all compatible transcripts. The key to our method is constructing an extension of the splice graph based on Aho-Corasick automata. The proof of equivalence is based on a novel reparameterization of the read generation model of a state-of-the-art transcript quantification method.
Conclusion: We propose a new approach for graph quantification, which is useful for modeling scenarios where the reference transcriptome is incomplete or unavailable, and which can be further used in transcriptome assembly or alternative splicing analysis.
Citations: 10
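The statement that every full path in the splice graph is a possible transcript can be made concrete with a small sketch. The graph below, its node names (`S`, `e1`, `e2`, `e3`, `T`) and the helper `full_paths` are hypothetical illustrations, not part of the paper or its software; the sketch only enumerates source-to-sink paths in a directed acyclic graph and does no quantification.

```python
from typing import Dict, List


def full_paths(graph: Dict[str, List[str]], source: str, sink: str) -> List[List[str]]:
    """Enumerate every source-to-sink path in a DAG by depth-first search."""
    paths = []
    stack = [[source]]
    while stack:
        path = stack.pop()
        node = path[-1]
        if node == sink:
            paths.append(path)
            continue
        # Extend the partial path along every outgoing edge.
        for nxt in graph.get(node, []):
            stack.append(path + [nxt])
    return sorted(paths)


# Hypothetical toy splice graph: exon e2 is optional (a skipped exon).
splice_graph = {
    "S": ["e1"],
    "e1": ["e2", "e3"],
    "e2": ["e3"],
    "e3": ["T"],
}
transcripts = full_paths(splice_graph, "S", "T")
for t in transcripts:
    print(" -> ".join(t))
```

The number of full paths can grow exponentially with the number of alternative splicing events, which is one reason to work with junction abundances on the graph rather than with an explicit transcript list.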
Tree diet: reducing the treewidth to unlock FPT algorithms in RNA bioinformatics
IF 1, Zone 4, Biology
Algorithms for Molecular Biology Pub Date : 2021-05-04 DOI: 10.1186/s13015-022-00213-z
Bertrand Marchand, Y. Ponty, L. Bulteau
Volume 17, issue 1
Citations: 1
Improving metagenomic binning results with overlapped bins using assembly graphs.
IF 1, Zone 4, Biology
Algorithms for Molecular Biology Pub Date : 2021-05-04 DOI: 10.1186/s13015-021-00185-6
Vijini G Mallawaarachchi, Anuradha S Wickramarachchi, Yu Lin
Volume 16, issue 1, page 3
Abstract:
Background: Metagenomic sequencing allows us to study the structure, diversity and ecology of microbial communities without the necessity of obtaining pure cultures. In many metagenomics studies, the reads obtained from metagenomic sequencing are first assembled into longer contigs, and these contigs are then binned into clusters, where the contigs in a cluster are expected to come from the same species. As different species may share common sequences in their genomes, one assembled contig may belong to multiple species. However, existing tools for binning contigs only support non-overlapped binning, i.e., each contig is assigned to at most one bin (species).
Results: In this paper, we introduce GraphBin2, which refines the binning results obtained from existing tools and, more importantly, is able to assign contigs to multiple bins. GraphBin2 uses the connectivity and coverage information from assembly graphs to adjust existing binning results on contigs and to infer contigs shared by multiple species. Experimental results on both simulated and real datasets demonstrate that GraphBin2 not only improves the binning results of existing tools but also supports assigning contigs to multiple bins.
Conclusion: GraphBin2 incorporates the coverage information into the assembly graph to refine the binning results obtained from existing binning tools. GraphBin2 also enables the detection of contigs that may belong to multiple species. We show that GraphBin2 outperforms its predecessor GraphBin on both simulated and real datasets. GraphBin2 is freely available at https://github.com/Vini2/GraphBin2 .
Citations: 9
Fast lightweight accurate xenograft sorting.
IF 1, Zone 4, Biology
Algorithms for Molecular Biology Pub Date : 2021-04-02 DOI: 10.1186/s13015-021-00181-w
Jens Zentgraf, Sven Rahmann
Volume 16, issue 1, page 2
Abstract:
Motivation: With an increasing number of patient-derived xenograft (PDX) models being created and subsequently sequenced to study tumor heterogeneity and to guide therapy decisions, there is a similarly increasing need for methods to separate reads originating from the graft (human) tumor and reads originating from the host species' (mouse) surrounding tissue. Two kinds of methods are in use. On the one hand, alignment-based tools require that reads are first mapped and aligned (by an external mapper/aligner) to the host and graft genomes separately; the tool itself then processes the resulting alignments and quality metrics (typically BAM files) to assign each read or read pair. On the other hand, alignment-free tools work directly on the raw read data (typically FASTQ files). Recent studies compare different approaches and tools, with varying results.
Results: We show that alignment-free methods for xenograft sorting are superior in CPU time usage and equivalent in accuracy. We improve upon state-of-the-art sorting by presenting a fast lightweight approach based on three-way bucketed quotiented Cuckoo hashing. Our hash table requires memory comparable to an FM index typically used for read alignment and less than other alignment-free approaches. It allows extremely fast lookups and uses less CPU time than other alignment-free methods and alignment-based methods at similar accuracy. Several engineering steps (e.g., shortcuts for unsuccessful lookups, software prefetching) improve the performance even further.
Availability: Our software xengsort is available under the MIT license at http://gitlab.com/genomeinformatics/xengsort . It is written in numba-compiled Python and comes with sample Snakemake workflows for hash table construction and dataset processing.
Citations: 8
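To illustrate what alignment-free sorting means in practice, here is a minimal sketch that classifies a read by looking up its k-mers in a table built from host and graft reference sequences. It is not xengsort's implementation: a plain Python `dict` stands in for the three-way bucketed quotiented Cuckoo hash table, and the toy sequences, the value of `k`, the function names and the simple majority rule are all assumptions made for this example.

```python
from collections import Counter


def kmers(seq, k):
    """Yield all overlapping substrings of length k."""
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


def build_index(host, graft, k):
    """Map each reference k-mer to 'host', 'graft' or 'both'."""
    index = {}
    for kmer in kmers(host, k):
        index[kmer] = "host"
    for kmer in kmers(graft, k):
        index[kmer] = "both" if index.get(kmer) in ("host", "both") else "graft"
    return index


def classify(read, index, k):
    """Assumed decision rule: the species with more species-specific k-mers wins."""
    votes = Counter(index.get(kmer, "neither") for kmer in kmers(read, k))
    if votes["host"] > votes["graft"]:
        return "host"
    if votes["graft"] > votes["host"]:
        return "graft"
    return "ambiguous"


# Toy reference sequences (hypothetical, far shorter than real genomes).
index = build_index(host="ACGTACGTTT", graft="ACGTAGGCCA", k=4)
for read in ("GTAGGCC", "TACGTT", "ACGTA"):
    print(read, classify(read, index, k=4))
```

No alignment step is involved: each read costs one table lookup per k-mer, which is why lookup speed and table size dominate the performance of this class of tools.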
Quantifying steric hindrance and topological obstruction to protein structure superposition.
IF 1, Zone 4, Biology
Algorithms for Molecular Biology Pub Date : 2021-02-27 DOI: 10.1186/s13015-020-00180-3
Peter Røgen
Volume 16, issue 1, page 1
Abstract:
Background: In computational structural biology, structure comparison is fundamental for our understanding of proteins. Structure comparison is, for example, algorithmically the starting point for computational studies of structural evolution, and it guides our efforts to predict protein structures from their amino acid sequences. Most methods for structural alignment of protein structures optimize the distances between aligned and superimposed residue pairs, i.e., the distances traveled by the aligned and superimposed residues during linear interpolation. Considering such a linear interpolation, these methods do not differentiate whether there is room for the interpolation, whether it causes steric clashes, or, more severely, whether it changes the topology of the compared protein backbone curves.
Results: To distinguish such cases, we analyze the linear interpolation between two aligned and superimposed backbones. We quantify the amount of steric clashes and find all self-intersections in a linear backbone interpolation. To determine whether the self-intersections alter the protein's backbone curve significantly, we present a path-finding algorithm that checks whether there exists a self-avoiding path in a neighborhood of the linear interpolation. A new path is constructed by altering the linear interpolation using a novel interpretation of Reidemeister moves from knot theory, working on three-dimensional curves rather than on knot diagrams. Either the algorithm finds a self-avoiding path or it returns a smallest set of essential self-intersections. Each of these indicates a significant difference between the folds of the aligned protein structures. As expected, we find at least one essential self-intersection separating most unknotted structures from a knotted structure, and we find even larger motions in proteins connected by obstruction-free linear interpolations. We also find examples of homologous proteins that are differently threaded, and we find many distinct folds connected by longer but simple deformations. TM-align is one of the most restrictive alignment programs. With standard parameters, it only aligns residues superimposed within a distance of 5 Ångström. We find 42165 topological obstructions between aligned parts in 142068 TM-alignments. Thus, this restrictive alignment procedure still allows topological dissimilarity of the aligned parts.
Conclusions: Based on the data, we conclude that our program ProteinAlignmentObstruction provides significant additional information to alignment scores based solely on distances between aligned and superimposed residue pairs.
Citations: 3
gsufsort: constructing suffix arrays, LCP arrays and BWTs for string collections.
IF 1, Zone 4, Biology
Algorithms for Molecular Biology Pub Date : 2020-09-22 eCollection Date: 2020-01-01 DOI: 10.1186/s13015-020-00177-y
Felipe A Louza, Guilherme P Telles, Simon Gog, Nicola Prezza, Giovanna Rosone
Volume 15, page 18
Abstract:
Background: The construction of a suffix array for a collection of strings is a fundamental task in bioinformatics and in many other applications that process strings. Related data structures, such as the Longest Common Prefix array, the Burrows-Wheeler transform, and the document array, are often needed to accompany the suffix array to efficiently solve a wide variety of problems. While several algorithms have been proposed to construct the suffix array for a single string, less emphasis has been put on algorithms to construct suffix arrays for string collections.
Result: In this paper we introduce gsufsort, open-source software for constructing the suffix array and related data indexing structures for a string collection with N symbols in O(N) time. Our tool is written in ANSI C and is based on the algorithm gSACA-K (Louza et al. in Theor Comput Sci 678:22-39, 2017), the fastest algorithm to construct suffix arrays for string collections. The tool supports large FASTA, FASTQ and text files with multiple strings as input. Experiments have shown very good performance on different types of strings.
Conclusions: gsufsort is a fast, portable, and lightweight tool for constructing the suffix array and additional data structures for string collections.
Citations: 11
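For readers unfamiliar with the data structures named in this abstract, the following sketch builds a suffix array, LCP array and BWT for a tiny string collection by plain sorting. It is deliberately naive (far from the O(N) time of gSACA-K) and is not gsufsort's code. The function name `build_indexes`, the choice of `$` as a terminator smaller than every other symbol, the tie-breaking by string index, and the convention that terminators never count toward the LCP are assumptions of this example.

```python
def build_indexes(strings):
    """Naive suffix array, LCP array and BWT for a collection of strings.

    Each suffix is identified by (document index, offset). Every string is
    terminated by "$"; ties between equal suffixes are broken by document index.
    """
    suffixes = sorted(
        (s[off:] + "$", doc, off)
        for doc, s in enumerate(strings)
        for off in range(len(s) + 1)
    )
    sa = [(doc, off) for _, doc, off in suffixes]

    # BWT: the symbol preceding each suffix in its own string ("$" at offset 0).
    bwt = "".join(strings[doc][off - 1] if off > 0 else "$" for doc, off in sa)

    # LCP: length of the common prefix of lexicographically adjacent suffixes,
    # stopping at a terminator.
    lcp = [0] * len(suffixes)
    for i in range(1, len(suffixes)):
        a, b = suffixes[i - 1][0], suffixes[i][0]
        n = 0
        while n < min(len(a), len(b)) and a[n] == b[n] and a[n] != "$":
            n += 1
        lcp[i] = n
    return sa, lcp, bwt


sa, lcp, bwt = build_indexes(["ab", "b"])
print(sa)
print(lcp)
print(bwt)
```

The document index stored in each suffix array entry plays the role of the document array mentioned in the abstract.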
A linear-time algorithm that avoids inverses and computes Jackknife (leave-one-out) products like convolutions or other operators in commutative semigroups.
IF 1, Zone 4, Biology
Algorithms for Molecular Biology Pub Date : 2020-09-19 eCollection Date: 2020-01-01 DOI: 10.1186/s13015-020-00178-x
John L Spouge, Joseph M Ziegelbauer, Mileidy Gonzalez
Volume 15, page 17
Abstract:
Background: Data about herpesvirus microRNA motifs on human circular RNAs suggested the following statistical question. Consider independent random counts, not necessarily identically distributed. Conditioned on the sum, decide whether one of the counts is unusually large. Exact computation of the p-value leads to a specific algorithmic problem. Given $n$ elements $g_0, g_1, \ldots, g_{n-1}$ in a set $G$ with the closure and associative properties and a commutative product without inverses, compute the jackknife (leave-one-out) products $\bar{g}_j = g_0 g_1 \cdots g_{j-1} g_{j+1} \cdots g_{n-1}$ ($0 \le j < n$).
Results: This article gives a linear-time Jackknife Product algorithm. Its upward phase constructs a standard segment tree for computing segment products like $g_{(i,j)} = g_i g_{i+1} \cdots g_{j-1}$; its novel downward phase mirrors the upward phase while exploiting the symmetry of $g_j$ and its complement $\bar{g}_j$. The algorithm requires storage for $2n$ elements of $G$ and only about $3n$ products. In contrast, the standard segment tree algorithms require about $n$ products for construction and $\log_2 n$ products for calculating each $\bar{g}_j$, i.e., about $n \log_2 n$ products in total; and a naïve quadratic algorithm using $n-2$ element-by-element products to compute each $\bar{g}_j$ requires $n$ …
Citations: 0
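The upward and downward phases described in this abstract can be sketched in a few lines. This is an illustration written from the abstract alone, not the authors' implementation: the function name `jackknife_products`, the array-based tree layout (leaves at positions n to 2n-1, children of node i at 2i and 2i+1) and the use of two separate arrays are assumptions made for clarity. With two arrays the sketch stores about 4n elements rather than the 2n the abstract reports, and it relies on commutativity because the array layout does not keep leaves in left-to-right order when n is not a power of two.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def jackknife_products(items: List[T], op: Callable[[T, T], T]) -> List[T]:
    """Leave-one-out products in a commutative semigroup, without inverses."""
    n = len(items)
    if n < 2:
        raise ValueError("need at least two elements: no identity element is assumed")

    # Upward phase: tree[i] is the product of all leaves below node i.
    # The root (node 1) is never needed, so it is not computed.
    tree: list = [None] * (2 * n)
    tree[n:] = items
    for i in range(n - 1, 1, -1):
        tree[i] = op(tree[2 * i], tree[2 * i + 1])

    # Downward phase: comp[i] is the product of all leaves NOT below node i.
    # For a child of the root this is just its sibling; further down it is
    # the parent's complement times the sibling's subtree product.
    comp: list = [None] * (2 * n)
    comp[2], comp[3] = tree[3], tree[2]
    for i in range(4, 2 * n):
        comp[i] = op(comp[i // 2], tree[i ^ 1])

    # The complement of leaf n + j is the jackknife product for items[j].
    return comp[n:]


print(jackknife_products([2, 3, 5, 7], lambda a, b: a * b))
print(jackknife_products([3, 1, 4, 1, 5], max))
```

The sketch performs n-2 products going up and 2n-4 going down, which is consistent with the "about 3n products" quoted above. The second call uses `max`, an operation that has no inverse, so the shortcut of dividing the total product by each element is not available.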
On an enhancement of RNA probing data using information theory.
IF 1, Zone 4, Biology
Algorithms for Molecular Biology Pub Date : 2020-08-07 eCollection Date: 2020-01-01 DOI: 10.1186/s13015-020-00176-z
Thomas J X Li, Christian M Reidys
Volume 15, page 15
Abstract: Identifying the secondary structure of an RNA is crucial for understanding its diverse regulatory functions. This paper focuses on how to enhance target identification in a Boltzmann ensemble of structures via chemical probing data. We employ an information-theoretic approach to solve the problem by considering a variant of the Rényi-Ulam game. Our framework is centered around the ensemble tree, a hierarchical bi-partition of the input ensemble that is constructed by recursively querying whether or not a base pair of maximum information entropy is contained in the target. These queries are answered by relating local to global probing data, employing the modularity in RNA secondary structures. We show that the leaves of the tree comprise sub-samples exhibiting a distinguished structure with high probability. In particular, for a Boltzmann ensemble incorporating probing data, which is well established in the literature, the probability of our framework correctly identifying the target in the leaf is greater than 90%.
Citations: 2
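One step of the bi-partition described in this abstract can be sketched as follows: estimate each base pair's probability from a sample of structures, pick the base pair whose presence or absence is most uncertain (maximum binary entropy), and split the sample on it. This is a toy illustration under stated assumptions, not the authors' framework: structures are represented as sets of `(i, j)` index pairs, the sample is hypothetical, and the step of answering the query from probing data is omitted entirely.

```python
import math
from collections import Counter


def binary_entropy(p):
    """Entropy in bits of a yes/no event with probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def split_on_max_entropy_pair(sample):
    """Return the most informative base pair and the two resulting sub-samples."""
    counts = Counter(bp for structure in sample for bp in structure)
    best = max(counts, key=lambda bp: binary_entropy(counts[bp] / len(sample)))
    with_bp = [s for s in sample if best in s]
    without_bp = [s for s in sample if best not in s]
    return best, with_bp, without_bp


# Hypothetical sample of four secondary structures, each a set of base pairs.
sample = [
    {(1, 10), (2, 9)},
    {(1, 10), (3, 8)},
    {(1, 10), (2, 9), (4, 7)},
    {(1, 10)},
]
best, with_bp, without_bp = split_on_max_entropy_pair(sample)
print(best, len(with_bp), len(without_bp))
```

In this sample the pair (1, 10) occurs in every structure and so carries no information, while (2, 9) occurs in exactly half of them and splits the sample evenly.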
Algorithms for the quantitative Lock/Key model of cytoplasmic incompatibility.
IF 1, Zone 4, Biology
Algorithms for Molecular Biology Pub Date : 2020-07-22 eCollection Date: 2020-01-01 DOI: 10.1186/s13015-020-00174-1
Tiziana Calamoneri, Mattia Gastaldello, Arnaud Mary, Marie-France Sagot, Blerina Sinaimeri
Volume 15, page 14
Abstract: Cytoplasmic incompatibility (CI) relates to the manipulation by the parasite Wolbachia of its host's reproduction. Despite its widespread occurrence, the molecular basis of CI remains unclear, and theoretical models have been proposed to understand the phenomenon. We consider in this paper the quantitative Lock-Key model, which currently represents a good hypothesis that is consistent with the available data. CI is in this case modelled as the problem of covering the edges of a bipartite graph with the minimum number of chain subgraphs. This problem is already known to be NP-hard, and we provide an exponential algorithm with a non-trivial complexity. Depending on the dataset, there are frequently many optimal solutions, which can differ considerably from one another biologically. Relying on a single optimal solution may therefore be problematic. For this reason, we address the problem of enumerating (listing) all minimal chain subgraph covers of a bipartite graph and show that it can be solved in quasi-polynomial time. Interestingly, in order to solve the above problems, we also considered the problem of enumerating all the maximal chain subgraphs of a bipartite graph and improved on the current results in the literature for the latter. Finally, to demonstrate the usefulness of our methods, we show an application on a real dataset.
Citations: 1
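The covering problem in this abstract is built on chain subgraphs. A bipartite graph is a chain graph when the neighbourhoods of the vertices on one side are totally ordered by inclusion, and that property is easy to test. The sketch below only performs this recognition step; it does not compute a minimum cover and is not taken from the paper. The function name `is_chain_graph` and the dictionary representation (left vertex mapped to its set of right neighbours) are assumptions of this example.

```python
from typing import Dict, Set


def is_chain_graph(adj: Dict[str, Set[str]]) -> bool:
    """Check whether the left-side neighbourhoods are nested.

    adj maps each left-side vertex to its set of right-side neighbours.
    After sorting by size, the neighbourhoods form a chain under inclusion
    exactly when each one is a subset of the next.
    """
    hoods = sorted(adj.values(), key=len)
    return all(a <= b for a, b in zip(hoods, hoods[1:]))


# Nested neighbourhoods {x} within {x, y} within {x, y, z}: a chain graph.
nested = {"a": {"x"}, "b": {"x", "y"}, "c": {"x", "y", "z"}}
# Two disjoint edges a-x and b-y: not a chain graph.
crossing = {"a": {"x"}, "b": {"y"}}

print(is_chain_graph(nested), is_chain_graph(crossing))
```

Two disjoint edges are the smallest obstruction, so any cover of the second graph needs at least two chain subgraphs.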