ISRN bioinformatics | Pub Date: 2013-05-25 | eCollection Date: 2013-01-01 | DOI: 10.1155/2013/361321
Bryan M H Keng, Oliver Y W Chan, Sean S J Heng, Maurice H T Ling
{"title":"Transcriptome Analysis of Spermophilus lateralis and Spermophilus tridecemlineatus Liver Does Not Suggest the Presence of Spermophilus-Liver-Specific Reference Genes.","authors":"Bryan M H Keng, Oliver Y W Chan, Sean S J Heng, Maurice H T Ling","doi":"10.1155/2013/361321","DOIUrl":"https://doi.org/10.1155/2013/361321","url":null,"abstract":"<p><p>The expressions of reference genes used in gene expression studies are assumed to be stable under most circumstances. However, studies have demonstrated that genes assumed to be stably expressed in a species are not necessarily stably expressed in other organisms. This study aims to evaluate the likelihood of genus-specific reference genes for liver using comparable microarray datasets from Spermophilus lateralis and Spermophilus tridecemlineatus. The coefficient of variance (CV) of each probe was calculated and there were 178 probes common between the lowest 10% CV of both datasets (n = 1258). All 3 lists were analysed by NormFinder. Our results suggest that the most invariant probe for S. tridecemlineatus was 02n12, while that for S. lateralis was 24j21. However, our results showed that Probes 02n12 and 24j21 are ranked 8644 and 926 in terms of invariancy for S. lateralis and S. tridecemlineatus respectively. This suggests the lack of common liver-specific reference probes for both S. lateralis and S. tridecemlineatus. Given that S. lateralis and S. tridecemlineatus are closely related species and the datasets are comparable, our results do not support the presence of genus-specific reference genes. 
</p>","PeriodicalId":90877,"journal":{"name":"ISRN bioinformatics","volume":"2013 ","pages":"361321"},"PeriodicalIF":0.0,"publicationDate":"2013-05-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1155/2013/361321","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"33272170","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ISRN bioinformatics | Pub Date: 2013-04-18 | eCollection Date: 2013-01-01 | DOI: 10.1155/2013/725434
Eran Elhaik, Dan Graur
{"title":"IsoPlotter(+): A Tool for Studying the Compositional Architecture of Genomes.","authors":"Eran Elhaik, Dan Graur","doi":"10.1155/2013/725434","DOIUrl":"https://doi.org/10.1155/2013/725434","url":null,"abstract":"<p><p>Eukaryotic genomes, particularly animal genomes, have a complex, nonuniform, and nonrandom internal compositional organization. The compositional organization of animal genomes can be described as a mosaic of discrete genomic regions, called \"compositional domains,\" each with a distinct GC content that significantly differs from those of its upstream and downstream neighboring domains. A typical animal genome consists of a mixture of compositionally homogeneous and nonhomogeneous domains of varying lengths and nucleotide compositions that are interspersed with one another. We have devised IsoPlotter, an unbiased segmentation algorithm for inferring the compositional organization of genomes. IsoPlotter has become an indispensable tool for describing genomic composition and has been used in the analysis of more than a dozen genomes. Applications include describing new genomes, correlating domain composition with gene composition and their density, studying the evolution of genomes, testing phylogenomic hypotheses, and detecting regions of potential interbreeding between human and extinct hominines. To extend the use of IsoPlotter, we designed a completely automated pipeline, called IsoPlotter(+), to carry out all segmentation analyses, including graphical display, and built a repository for compositional domain maps of all fully sequenced vertebrate and invertebrate genomes. The IsoPlotter(+) pipeline and repository offer a comprehensive solution to the study of genome compositional architecture. Here, we demonstrate IsoPlotter(+) by applying it to human and insect genomes. The computational tools and data repository are available online. 
</p>","PeriodicalId":90877,"journal":{"name":"ISRN bioinformatics","volume":"2013 ","pages":"725434"},"PeriodicalIF":0.0,"publicationDate":"2013-04-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4393066/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"33272129","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"HMEC: A Heuristic Algorithm for Individual Haplotyping with Minimum Error Correction.","authors":"Md Shamsuzzoha Bayzid, Md Maksudul Alam, Abdullah Mueen, Md Saidur Rahman","doi":"10.1155/2013/291741","DOIUrl":"https://doi.org/10.1155/2013/291741","url":null,"abstract":"<p><p>A haplotype is a pattern of single nucleotide polymorphisms (SNPs) on a single chromosome. Constructing a pair of haplotypes from aligned and overlapping but intermixed and erroneous fragments of the chromosomal sequences is a nontrivial problem. The minimum error correction approach aims to minimize the number of errors to be corrected so that the pair of haplotypes can be constructed through consensus of the fragments. We give a heuristic algorithm (HMEC) that searches through alternative solutions using a gain measure and stops whenever no better solution can be achieved. The time complexity of each iteration is O(m^3 k) for an m × k SNP matrix, where m and k are the number of fragments (number of rows) and number of SNP sites (number of columns), respectively, in an SNP matrix. An alternative gain measure is also given to reduce running time. We have compared our algorithm with other methods in terms of accuracy and running time on both simulated and real data, and our extensive experimental results indicate the superiority of our algorithm over others. </p>","PeriodicalId":90877,"journal":{"name":"ISRN bioinformatics","volume":"2013 ","pages":"291741"},"PeriodicalIF":0.0,"publicationDate":"2013-01-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1155/2013/291741","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"33179456","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ISRN bioinformatics | Pub Date: 2012-12-12 | eCollection Date: 2012-01-01 | DOI: 10.5402/2012/371718
Jarrett D Morrow, Brandon W Higgs
{"title":"CallSim: Evaluation of Base Calls Using Sequencing Simulation.","authors":"Jarrett D Morrow, Brandon W Higgs","doi":"10.5402/2012/371718","DOIUrl":"https://doi.org/10.5402/2012/371718","url":null,"abstract":"<p><p>Accurate base calls generated from sequencing data are required for downstream biological interpretation, particularly in the case of rare variants. CallSim is a software application that provides evidence for the validity of base calls believed to be sequencing errors and it is applicable to Ion Torrent and 454 data. The algorithm processes a single read using a Monte Carlo approach to sequencing simulation, not dependent upon information from any other read in the data set. Three examples from general read correction, as well as from error-or-variant classification, demonstrate its effectiveness for a robust low-volume read processing base corrector. Specifically, correction of errors in Ion Torrent reads from a study involving mutations in multidrug resistant Staphylococcus aureus illustrates an ability to classify an erroneous homopolymer call. In addition, support for a rare variant in 454 data for a mixed viral population demonstrates \"base rescue\" capabilities. CallSim provides evidence regarding the validity of base calls in sequences produced by 454 or Ion Torrent systems and is intended for hands-on downstream processing analysis. These downstream efforts, although time consuming, are necessary steps for accurate identification of rare variants. 
</p>","PeriodicalId":90877,"journal":{"name":"ISRN bioinformatics","volume":"2012 ","pages":"371718"},"PeriodicalIF":0.0,"publicationDate":"2012-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4393072/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"33272805","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ISRN bioinformatics | Pub Date: 2012-11-21 | eCollection Date: 2012-01-01 | DOI: 10.5402/2012/696758
Nelson R Salinas, Damon P Little
{"title":"Electric LAMP: Virtual Loop-Mediated Isothermal AMPlification.","authors":"Nelson R Salinas, Damon P Little","doi":"10.5402/2012/696758","DOIUrl":"https://doi.org/10.5402/2012/696758","url":null,"abstract":"<p><p>We present eLAMP, a PERL script, with Tk graphical interface, that electronically simulates Loop-mediated AMPlification (LAMP) allowing users to efficiently test putative LAMP primers on a set of target sequences. eLAMP can match primers to templates using either exact (via builtin PERL regular expressions) or approximate matching (via the tre-agrep library). Performance was tested on 40 whole genome sequences of Staphylococcus. eLAMP correctly predicted that the two tested primer sets would amplify from S. aureus genomes and not amplify from other Staphylococcus species. Open source (GNU Public License) PERL scripts are available for download from the New York Botanical Garden's website. </p>","PeriodicalId":90877,"journal":{"name":"ISRN bioinformatics","volume":"2012 ","pages":"696758"},"PeriodicalIF":0.0,"publicationDate":"2012-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4417551/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"33179454","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ISRN bioinformatics | Pub Date: 2012-11-11 | eCollection Date: 2012-01-01 | DOI: 10.5402/2012/157135
Debra Knisley, Jeff Knisley, Chelsea Ross, Alissa Rockney
{"title":"Classifying multigraph models of secondary RNA structure using graph-theoretic descriptors.","authors":"Debra Knisley, Jeff Knisley, Chelsea Ross, Alissa Rockney","doi":"10.5402/2012/157135","DOIUrl":"https://doi.org/10.5402/2012/157135","url":null,"abstract":"<p><p>The prediction of secondary RNA folds from primary sequences continues to be an important area of research given the significance of RNA molecules in biological processes such as gene regulation. To facilitate this effort, graph models of secondary structure have been developed to quantify and thereby characterize the topological properties of the secondary folds. In this work we utilize a multigraph representation of a secondary RNA structure to examine the ability of the existing graph-theoretic descriptors to classify all possible topologies as either RNA-like or not RNA-like. We use more than one hundred descriptors and several different machine learning approaches, including nearest neighbor algorithms, one-class classifiers, and several clustering techniques. We predict that many more topologies will be identified as those representing RNA secondary structures than currently predicted in the RAG (RNA-As-Graphs) database. The results also suggest which descriptors and which algorithms are more informative in classifying and exploring secondary RNA structures. </p>","PeriodicalId":90877,"journal":{"name":"ISRN bioinformatics","volume":"2012 ","pages":"157135"},"PeriodicalIF":0.0,"publicationDate":"2012-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.5402/2012/157135","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"33173868","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ISRN bioinformatics | Pub Date: 2012-11-11 | eCollection Date: 2012-01-01 | DOI: 10.5402/2012/381023
Lars Seemann, Jason Shulman, Gemunu H Gunaratne
{"title":"A robust topology-based algorithm for gene expression profiling.","authors":"Lars Seemann, Jason Shulman, Gemunu H Gunaratne","doi":"10.5402/2012/381023","DOIUrl":"https://doi.org/10.5402/2012/381023","url":null,"abstract":"<p><p>Early and accurate diagnoses of cancer can significantly improve the design of personalized therapy and enhance the success of therapeutic interventions. Histopathological approaches, which rely on microscopic examinations of malignant tissue, are not conducive to timely diagnoses. High throughput genomics offers a possible new classification of cancer subtypes. Unfortunately, most clustering algorithms have not been proven sufficiently robust. We propose a novel approach that relies on the use of statistical invariants and persistent homology, one of the most exciting recent developments in topology. It identifies a sufficient but compact set of genes for the analysis as well as a core group of tightly correlated patient samples for each subtype. Partitioning occurs hierarchically and allows for the identification of genetically similar subtypes. We analyzed the gene expression profiles of 202 tumors of the brain cancer glioblastoma multiforme (GBM) given at the Cancer Genome Atlas (TCGA) site. We identify core patient groups associated with the classical, mesenchymal, and proneural subtypes of GBM. In our analysis, the neural subtype consists of several small groups rather than a single component. A subtype prediction model is introduced which partitions tumors in a manner consistent with clustering algorithms but requires the genetic signature of only 59 genes. 
</p>","PeriodicalId":90877,"journal":{"name":"ISRN bioinformatics","volume":"2012 ","pages":"381023"},"PeriodicalIF":0.0,"publicationDate":"2012-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4393071/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"33173870","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ISRN bioinformatics | Pub Date: 2012-11-01 | eCollection Date: 2012-01-01 | DOI: 10.5402/2012/419419
Roozbeh Manshaei, Pooya Sobhe Bidari, Mahdi Aliyari Shoorehdeli, Amir Feizi, Tahmineh Lohrasebi, Mohammad Ali Malboobi, Matthew Kyan, Javad Alirezaie
{"title":"Hybrid-controlled neurofuzzy networks analysis resulting in genetic regulatory networks reconstruction.","authors":"Roozbeh Manshaei, Pooya Sobhe Bidari, Mahdi Aliyari Shoorehdeli, Amir Feizi, Tahmineh Lohrasebi, Mohammad Ali Malboobi, Matthew Kyan, Javad Alirezaie","doi":"10.5402/2012/419419","DOIUrl":"https://doi.org/10.5402/2012/419419","url":null,"abstract":"<p><p>Reverse engineering of gene regulatory networks (GRNs) is the process of estimating genetic interactions of a cellular system from gene expression data. In this paper, we propose a novel hybrid systematic algorithm based on a neurofuzzy network for reconstructing GRNs from observational gene expression data when only a medium-small number of measurements are available. The approach uses fuzzy logic to transform gene expression values into qualitative descriptors that can be evaluated by using a set of defined rules. The algorithm uses a neurofuzzy network to model genes' effects on other genes, followed by four stages of decision making to extract gene interactions. One of the main features of the proposed algorithm is that an optimal number of fuzzy rules can be easily and rapidly extracted without overparameterizing. Data analysis and simulation are conducted on microarray expression profiles of S. cerevisiae cell cycle and demonstrate that the proposed algorithm not only selects the patterns of the time series gene expression data accurately, but also provides models with better reconstruction accuracy when compared with four published algorithms: DBNs, VBEM, time delay ARACNE, and PF subjected to LASSO. The accuracy of the proposed approach is evaluated in terms of recall and F-score for the network reconstruction task. 
</p>","PeriodicalId":90877,"journal":{"name":"ISRN bioinformatics","volume":"2012 ","pages":"419419"},"PeriodicalIF":0.0,"publicationDate":"2012-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4393070/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"33173871","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ISRN bioinformatics | Pub Date: 2012-10-16 | eCollection Date: 2012-01-01 | DOI: 10.5402/2012/537217
Lingling An, R W Doerge
{"title":"Dynamic clustering of gene expression.","authors":"Lingling An, R W Doerge","doi":"10.5402/2012/537217","DOIUrl":"https://doi.org/10.5402/2012/537217","url":null,"abstract":"<p><p>It is well accepted that genes are simultaneously involved in multiple biological processes and that genes are coordinated over the duration of such events. Unfortunately, clustering methodologies that group genes for the purpose of novel gene discovery fail to acknowledge the dynamic nature of biological processes and provide static clusters, even when the expression of genes is assessed across time or developmental stages. By taking advantage of techniques and theories from time frequency analysis, periodic gene expression profiles are dynamically clustered based on the assumption that different spectral frequencies characterize different biological processes. A two-step cluster validation approach is proposed to statistically estimate both the optimal number of clusters and to distinguish significant clusters from noise. The resulting clusters reveal coordinated coexpressed genes. This novel dynamic clustering approach has broad applicability to a vast range of sequential data scenarios where the order of the series is of interest. </p>","PeriodicalId":90877,"journal":{"name":"ISRN bioinformatics","volume":"2012 ","pages":"537217"},"PeriodicalIF":0.0,"publicationDate":"2012-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4393063/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"33179453","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Differential Expression Analysis for RNA-Seq Data.","authors":"Rashi Gupta, Isha Dewan, Richa Bharti, Alok Bhattacharya","doi":"10.5402/2012/817508","DOIUrl":"https://doi.org/10.5402/2012/817508","url":null,"abstract":"<p><p>RNA-Seq is increasingly being used for gene expression profiling. In this approach, next-generation sequencing (NGS) platforms are used for sequencing. Due to their highly parallel nature, millions of reads are generated in a short time and at low cost. Analysis of the data is therefore a major challenge, and the development of statistical and computational methods is essential for drawing meaningful conclusions from this huge volume of data. Here, we assessed three different types of normalization (transcript parts per million, trimmed mean of M values, quantile normalization) and evaluated whether normalized data reduce technical variability across replicates. In addition, we proposed two novel methods for detecting differentially expressed genes between two biological conditions: (i) a likelihood ratio method and (ii) a Bayesian method. Our proposed methods for finding differentially expressed genes were tested on three real datasets. Our methods performed at least as well as, and often better than, the existing methods for analysis of differential expression. </p>","PeriodicalId":90877,"journal":{"name":"ISRN bioinformatics","volume":"2012 ","pages":"817508"},"PeriodicalIF":0.0,"publicationDate":"2012-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4393055/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"33272167","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}