{"title":"An Optimal Seed Based Compression Algorithm for DNA Sequences.","authors":"Pamela Vinitha Eric, Gopakumar Gopalakrishnan, Muralikrishnan Karunakaran","doi":"10.1155/2016/3528406","DOIUrl":"https://doi.org/10.1155/2016/3528406","url":null,"abstract":"This paper proposes a seed based lossless compression algorithm for DNA sequences that uses a substitution method similar to the Lempel-Ziv compression scheme. The proposed method exploits the repetition structures that are inherent in DNA sequences by creating an offline dictionary which contains all such repeats along with the details of mismatches. By ensuring that only promising mismatches are allowed, the method achieves a compression ratio that is on par with or better than existing lossless DNA sequence compression algorithms.","PeriodicalId":39059,"journal":{"name":"Advances in Bioinformatics","volume":"2016 ","pages":"3528406"},"PeriodicalIF":0.0,"publicationDate":"2016-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4983397/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"34331563","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Feature Selection Has a Large Impact on One-Class Classification Accuracy for MicroRNAs in Plants.","authors":"Malik Yousef, Müşerref Duygu Saçar Demirci, Waleed Khalifa, Jens Allmer","doi":"10.1155/2016/5670851","DOIUrl":"https://doi.org/10.1155/2016/5670851","url":null,"abstract":"MicroRNAs (miRNAs) are short RNA sequences involved in posttranscriptional gene regulation. Their experimental analysis is complicated and, therefore, needs to be supplemented with computational miRNA detection. Currently computational miRNA detection is mainly performed using machine learning and in particular two-class classification. For machine learning, the miRNAs need to be parametrized and more than 700 features have been described. Positive training examples for machine learning are readily available, but negative data is hard to come by. Therefore, it seems prerogative to use one-class classification instead of two-class classification. Previously, we were able to almost reach two-class classification accuracy using one-class classifiers. In this work, we employ feature selection procedures in conjunction with one-class classification and show that there is up to 36% difference in accuracy among these feature selection methods. The best feature set allowed the training of a one-class classifier which achieved an average accuracy of ~95.6% thereby outperforming previous two-class-based plant miRNA detection approaches by about 0.5%. We believe that this can be improved upon in the future by rigorous filtering of the positive training examples and by improving current feature clustering algorithms to better target pre-miRNA feature selection.","PeriodicalId":39059,"journal":{"name":"Advances in Bioinformatics","volume":"2016 ","pages":"5670851"},"PeriodicalIF":0.0,"publicationDate":"2016-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1155/2016/5670851","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"34401639","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Molecular Docking and In Silico ADMET Study Reveals Acylguanidine 7a as a Potential Inhibitor of β-Secretase.","authors":"Chaluveelaveedu Murleedharan Nisha, Ashwini Kumar, Prateek Nair, Nityasha Gupta, Chitrangda Silakari, Timir Tripathi, Awanish Kumar","doi":"10.1155/2016/9258578","DOIUrl":"https://doi.org/10.1155/2016/9258578","url":null,"abstract":"The amyloidogenic pathway in Alzheimer's disease (AD) involves the breakdown of APP by β-secretase followed by γ-secretase and results in the formation of amyloid beta plaques. β-secretase has been a promising target for developing novel anti-Alzheimer drugs. To test different molecules for this purpose, test ligands such as acylguanidine 7a, rosiglitazone, pioglitazone, and tartaric acid were docked against the target protein, the β-secretase enzyme retrieved from the Protein Data Bank, with MK-8931 (phase III trial, Merck) as the positive control. Docking revealed that acylguanidine 7a has the lowest free binding energy, followed by MK-8931 and pioglitazone, and binds significantly to β-secretase. In silico ADMET predictions revealed that, except for tartaric acid, all compounds had minimal toxic effects and good absorption and solubility characteristics. These compounds may serve as potential lead compounds for developing new anti-Alzheimer drugs.","PeriodicalId":39059,"journal":{"name":"Advances in Bioinformatics","volume":"2016 ","pages":"9258578"},"PeriodicalIF":0.0,"publicationDate":"2016-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1155/2016/9258578","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"34401640","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust Feature Selection from Microarray Data Based on Cooperative Game Theory and Qualitative Mutual Information.","authors":"Atiyeh Mortazavi, Mohammad Hossein Moattar","doi":"10.1155/2016/1058305","DOIUrl":"https://doi.org/10.1155/2016/1058305","url":null,"abstract":"High dimensionality of microarray data sets may lead to low efficiency and overfitting. In this paper, a multiphase cooperative game theoretic feature selection approach is proposed for microarray data classification. In the first phase, due to high dimension of microarray data sets, the features are reduced using one of the two filter-based feature selection methods, namely, mutual information and Fisher ratio. In the second phase, Shapley index is used to evaluate the power of each feature. The main innovation of the proposed approach is to employ Qualitative Mutual Information (QMI) for this purpose. The idea of Qualitative Mutual Information causes the selected features to have more stability and this stability helps to deal with the problem of data imbalance and scarcity. In the third phase, a forward selection scheme is applied which uses a scoring function to weight each feature. The performance of the proposed method is compared with other popular feature selection algorithms such as Fisher ratio, minimum redundancy maximum relevance, and previous works on cooperative game based feature selection. The average classification accuracy on eleven microarray data sets shows that the proposed method improves both average accuracy and average stability compared to other approaches.","PeriodicalId":39059,"journal":{"name":"Advances in Bioinformatics","volume":"2016 ","pages":"1058305"},"PeriodicalIF":0.0,"publicationDate":"2016-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1155/2016/1058305","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"34440167","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Local Mutational Pressures in Genomes of Zaire Ebolavirus and Marburg Virus","authors":"V. V. Khrustalev, E. V. Barkovsky, T. Khrustaleva","doi":"10.1155/2015/678587","DOIUrl":"https://doi.org/10.1155/2015/678587","url":null,"abstract":"Heterogeneities in nucleotide content distribution along the length of Zaire ebolavirus and Marburg virus genomes have been analyzed. Results showed that there is asymmetric mutational A-pressure in the majority of Zaire ebolavirus genes; there is mutational AC-pressure in the coding region of the matrix protein VP40, probably, caused by its high expression at the end of the infection process; there is also AC-pressure in the 3′-part of the nucleoprotein (NP) coding gene associated with low amount of secondary structure formed by the 3′-part of its mRNA; in the middle of the glycoprotein (GP) coding gene that kind of mutational bias is linked with the high amount of secondary structure formed by the corresponding fragment of RNA negative (−) strand; there is relatively symmetric mutational AU-pressure in the polymerase (Pol) coding gene caused by its low expression level. In Marburg virus all genes, including C-rich fragment of GP coding region, demonstrate asymmetric mutational A-bias, while the last gene (Pol) demonstrates more symmetric mutational AU-pressure. 
The hypothesis of a newly synthesized RNA negative (−) strand shielding by complementary fragments of mRNAs has been described in this work: shielded fragments of RNA negative (−) strand should be better protected from oxidative damage and prone to ADAR-editing.","PeriodicalId":39059,"journal":{"name":"Advances in Bioinformatics","volume":"2015 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-12-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1155/2015/678587","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"65091514","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"HBS-Tools for Hairpin Bisulfite Sequencing Data Processing and Analysis","authors":"Ming-an Sun, K. Velmurugan, David Keimig, Hehuang Xie","doi":"10.1155/2015/760423","DOIUrl":"https://doi.org/10.1155/2015/760423","url":null,"abstract":"The emerging genome-wide hairpin bisulfite sequencing (hairpin-BS-Seq) technique enables the determination of the methylation pattern for DNA double strands simultaneously. Compared with traditional bisulfite sequencing (BS-Seq) techniques, hairpin-BS-Seq can determine methylation fidelity and increase mapping efficiency. However, no computational tool has been designed for the analysis of hairpin-BS-Seq data yet. Here we present HBS-tools, a set of command line based tools for the preprocessing, mapping, methylation calling, and summarizing of genome-wide hairpin-BS-Seq data. It accepts paired-end hairpin-BS-Seq reads to recover the original (pre-bisulfite-converted) sequences using global alignment and then calls the methylation statuses for cytosines on both DNA strands after mapping the original sequences to the reference genome. After applying to hairpin-BS-Seq datasets, we found that HBS-tools have a reduced mapping time and improved mapping efficiency compared with state-of-the-art mapping tools. The HBS-tools source scripts, along with user guide and testing data, are freely available for download.","PeriodicalId":39059,"journal":{"name":"Advances in Bioinformatics","volume":"2015 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-12-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1155/2015/760423","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"65140493","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Developing of the Computer Method for Annotation of Bacterial Genes","authors":"Mikhail A. Golyshev, E. Korotkov","doi":"10.1155/2015/635437","DOIUrl":"https://doi.org/10.1155/2015/635437","url":null,"abstract":"In recent years a great number of bacterial genomes have been sequenced. One of the most important challenges of computational genomics is now the functional annotation of nucleic acid sequences. In this study we present a computational method and annotation system for predicting biological functions using phylogenetic profiles. The phylogenetic profile of a gene was created by searching for similarities between the nucleotide sequence of the gene and 1204 reference genomes, with further estimation of the statistical significance of the similarities found. The profiles of genes with known functions were used to predict possible functions and functional groups for new genes. We conducted the functional annotation for genes from 104 bacterial genomes and compared the functions predicted by our system with the already known functions. For genes that have already been annotated, the known function matched the function we predicted 63% of the time, and 86% of the time the known function was found within the top five predicted functions. In addition, our system increased the share of annotated genes by 19%. The developed system may be used as an alternative or complementary system to the current annotation systems.","PeriodicalId":39059,"journal":{"name":"Advances in Bioinformatics","volume":"2015 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1155/2015/635437","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"65072059","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluation of Docking Target Functions by the Comprehensive Investigation of Protein-Ligand Energy Minima","authors":"I. Oferkin, E. V. Katkova, A. Sulimov, D. Kutov, S. Sobolev, V. Voevodin, V. Sulimov","doi":"10.1155/2015/126858","DOIUrl":"https://doi.org/10.1155/2015/126858","url":null,"abstract":"The adequate choice of the docking target function impacts the accuracy of the ligand positioning as well as the accuracy of the protein-ligand binding energy calculation. To evaluate a docking target function we compared positions of its minima with the experimentally known pose of the ligand in the protein active site. We evaluated five docking target functions based on either the MMFF94 force field or the PM7 quantum-chemical method with or without implicit solvent models: PCM, COSMO, and SGB. Each function was tested on the same set of 16 protein-ligand complexes. For exhaustive low-energy minima search the novel MPI parallelized docking program FLM and large supercomputer resources were used. Protein-ligand binding energies calculated using low-energy minima were compared with experimental values. It was demonstrated that the docking target function on the base of the MMFF94 force field in vacuo can be used for discovery of native or near native ligand positions by finding the low-energy local minima spectrum of the target function. The importance of solute-solvent interaction for the correct ligand positioning is demonstrated. 
It is shown that docking accuracy can be improved by replacement of the MMFF94 force field by the new semiempirical quantum-chemical PM7 method.","PeriodicalId":39059,"journal":{"name":"Advances in Bioinformatics","volume":"95 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1155/2015/126858","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"64794411","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"In Silico Investigation of Flavonoids as Potential Trypanosomal Nucleoside Hydrolase Inhibitors","authors":"C. Ha, A. Fatima, A. Gaurav","doi":"10.1155/2015/826047","DOIUrl":"https://doi.org/10.1155/2015/826047","url":null,"abstract":"Human African Trypanosomiasis is endemic to 37 countries of sub-Saharan Africa. It is caused by two related species of Trypanosoma brucei. Current therapies suffer from resistance and public accessibility of expensive medicines. Finding safer and effective therapies of natural origin is being extensively explored worldwide. Pentamidine is the only available therapy for inhibiting the P2 adenosine transporter involved in the purine salvage pathway of the trypanosomatids. The objective of the present study is to use computational studies for the investigation of the probable trypanocidal mechanism of flavonoids. Docking experiments were carried out on eight flavonoids of varying level of hydroxylation, namely, flavone, 5-hydroxyflavone, 7-hydroxyflavone, chrysin, apigenin, kaempferol, fisetin, and quercetin. Using AutoDock 4.2, these compounds were tested for their affinity towards inosine-adenosine-guanosine nucleoside hydrolase and the inosine-guanosine nucleoside hydrolase, the major enzymes of the purine salvage pathway. Our results showed that all of the eight tested flavonoids showed high affinities for both hydrolases (lowest free binding energy ranging from −10.23 to −7.14 kcal/mol). 
These compounds, especially the hydroxylated derivatives, could be further studied as potential inhibitors of the nucleoside hydrolases.","PeriodicalId":39059,"journal":{"name":"Advances in Bioinformatics","volume":"2015 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1155/2015/826047","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"65170009","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"High-Throughput Quantification of Phenotype Heterogeneity Using Statistical Features","authors":"A. Chaddad, C. Tanougast","doi":"10.1155/2015/728164","DOIUrl":"https://doi.org/10.1155/2015/728164","url":null,"abstract":"Statistical features are widely used in radiology for tumor heterogeneity assessment using magnetic resonance (MR) imaging. In this paper, feature selection based on decision tree is examined to determine the relevant subset of glioblastoma (GBM) phenotypes in the statistical domain. To discriminate between the active tumor (vAT) and edema/invasion (vE) phenotypes, we selected the significant features using analysis of variance (ANOVA) with p value < 0.01. Then, we implemented the decision tree to define the optimal feature subset for the phenotype classifier. Naïve Bayes (NB), support vector machine (SVM), and decision tree (DT) classifiers were considered to evaluate the performance of the feature based scheme in terms of its capability to discriminate vAT from vE. All nine features were statistically significant for distinguishing vAT from vE with p value < 0.01. Feature selection based on decision tree showed the best performance in the comparative study against the full feature set. The two selected features, Kurtosis and Skewness, achieved the highest values, with 58.33-75.00% classifier accuracy and 73.88-92.50% AUC. This study demonstrated the ability of statistical features to provide a quantitative, individualized measurement for glioblastoma patients and to assess phenotype progression.","PeriodicalId":39059,"journal":{"name":"Advances in Bioinformatics","volume":"2015 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-10-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1155/2015/728164","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"65123813","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}