Kaushal Desai, David Brott, Xiaohua Hu, Anastasia Christianson
{"title":"Identifying susceptibility networks for drug-induced non-immune neutropenia.","authors":"Kaushal Desai, David Brott, Xiaohua Hu, Anastasia Christianson","doi":"10.1504/ijdmb.2015.066339","DOIUrl":"https://doi.org/10.1504/ijdmb.2015.066339","url":null,"abstract":"<p><p>Systems toxicology, a branch of toxicology that studies drug effects at the level of biological systems, offers exciting opportunities to discover toxicity-related sub-networks using high-throughput technologies. This paper takes a computational approach to systems toxicology and investigates the use of automated signalling path detection for discovery of potential biomarkers of drug-induced non-immune neutropenia. The algorithm utilises a gene expression change measure to mine a large protein interaction network and identify chemical-toxicity signalling paths. Cytoscape-based analysis of detected signalling paths with statistically significant path expression scores reveals 'hub' proteins and a smaller sub-network of path proteins. The importance of 'hub' and drug-toxicity signalling path proteins in haematological and apoptotic signal transduction networks is investigated in order to understand the value of automated signalling path detection approach.</p>","PeriodicalId":54964,"journal":{"name":"International Journal of Data Mining and Bioinformatics","volume":null,"pages":null},"PeriodicalIF":0.3,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1504/ijdmb.2015.066339","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"33973464","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sigrun Helga Lund, Asgeir Sigurdsson, Sigurjon Axel Gudjonsson, Julius Gudmundsson, Daniel Fannar Gudbjartsson, Thorunn Rafnar, Kari Stefansson, Gunnar Stefansson
{"title":"The effect of SNPs on expression levels in Nimblegen RNA expression microarrays.","authors":"Sigrun Helga Lund, Asgeir Sigurdsson, Sigurjon Axel Gudjonsson, Julius Gudmundsson, Daniel Fannar Gudbjartsson, Thorunn Rafnar, Kari Stefansson, Gunnar Stefansson","doi":"10.1504/ijdmb.2015.068949","DOIUrl":"https://doi.org/10.1504/ijdmb.2015.068949","url":null,"abstract":"<p><p>In this paper the effect of SNPs on expression levels in Nimblegen RNA expression microarrays is investigated. A vast number of replicates of probe pairs representing both alleles of SNPs on 14 loci allows accurate estimation of the difference in signal intensities both within and between probe pairs. The majority of probe-pairs with sufficiently high expression have significant differences in expression levels within the pair and the difference shows concordance with the genotype of the samples. With two or more replicates of each probe, the allele-to-allele variance dominates the error in estimating the difference within the probe-pair; ten replicates are needed for adequate power in calling a true difference within a single probe-pair. Using the expression level of the probe within the probe-pair that has the higher value gives more accurate estimates. When using probes at loci containing known SNPs, one should use probes containing both alleles of the SNP.</p>","PeriodicalId":54964,"journal":{"name":"International Journal of Data Mining and Bioinformatics","volume":null,"pages":null},"PeriodicalIF":0.3,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1504/ijdmb.2015.068949","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"34106976","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A new algorithm for essential proteins identification based on the integration of protein complex co-expression information and edge clustering coefficient.","authors":"Jiawei Luo, Juan Wu","doi":"10.1504/ijdmb.2015.069654","DOIUrl":"https://doi.org/10.1504/ijdmb.2015.069654","url":null,"abstract":"<p><p>Essential proteins provide valuable information for the development of biology and medical research from the system level. The accuracy of methods based only on topological centrality is deeply affected by noise in the network. Therefore, exploring efficient methods for identifying essential proteins would be of great value. Using biological features to identify essential proteins is efficient in reducing the noise in the PPI network. In this paper, based on the consideration that essential proteins evolve slowly and play a central role within a network, a new algorithm, named CED, is proposed. CED mainly employs gene expression level, protein complex information and edge clustering coefficient to predict essential proteins. The performance of CED is validated based on the yeast Protein-Protein Interaction (PPI) network obtained from DIP database and BioGRID database. The prediction accuracy of CED exceeds that of seven other algorithms when applied to the two databases.</p>","PeriodicalId":54964,"journal":{"name":"International Journal of Data Mining and Bioinformatics","volume":null,"pages":null},"PeriodicalIF":0.3,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1504/ijdmb.2015.069654","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"34125293","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Minimum Bayesian error probability-based gene subset selection.","authors":"Jian Li, Tian Yu, Jin-Mao Wei","doi":"10.1504/ijdmb.2015.070056","DOIUrl":"https://doi.org/10.1504/ijdmb.2015.070056","url":null,"abstract":"<p><p>Sifting functional genes is crucial to the new strategies for drug discovery and prospective patient-tailored therapy. Generally, simply generating gene subset by selecting the top k individually superior genes may obtain an inferior gene combination, for some selected genes may be redundant with respect to some others. In this paper, we propose to select gene subset based on the criterion of minimum Bayesian error probability. The method dynamically evaluates all available genes and sifts only one gene at a time. A gene is selected if its combination with the other selected genes can gain better classification information. Within the generated gene subset, each individual gene is the most discriminative one in comparison with those that classify cancers in the same way as this gene does and different genes are more discriminative in combination than in individual. The genes selected in this way are likely to be functional ones from the system biology perspective, for genes tend to co-regulate rather than regulate individually. Experimental results show that the classifiers induced based on this method are capable of classifying cancers with high accuracy, while only a small number of genes are involved.</p>","PeriodicalId":54964,"journal":{"name":"International Journal of Data Mining and Bioinformatics","volume":null,"pages":null},"PeriodicalIF":0.3,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1504/ijdmb.2015.070056","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"34192167","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Condensing position-specific scoring matrixs by the Kidera factors for ligand-binding site prediction.","authors":"Chun Fang, Tamotsu Noguchi, Hayato Yamana","doi":"10.1504/ijdmb.2015.068954","DOIUrl":"https://doi.org/10.1504/ijdmb.2015.068954","url":null,"abstract":"<p><p>Position-specific scoring matrix (PSSM) has been widely used for identifying protein functional sites. However, it is 20-dimensional and contains many redundant features. The Kidera factors were reported to contain information relating to almost all physical properties of amino acids, but it requires appropriate weighting coefficients to express their properties. We developed a novel method, named as KSPSSMpred, which integrated PSSM and the Kidera Factors into a 10-dimensional matrix (KSPSSM) for ligand-binding site prediction. Flavin adenine dinucleotide (FAD) was chosen as a representative ligand for this study. When compared with five other feature-based methods on a benchmark dataset, KSPSSMpred performed the best. This study demonstrates that KSPSSM is an effective feature extraction method which can enrich PSSM with information relating to 188 physical properties of residues, and reduce 50% feature dimensions without losing information included in the PSSM.</p>","PeriodicalId":54964,"journal":{"name":"International Journal of Data Mining and Bioinformatics","volume":null,"pages":null},"PeriodicalIF":0.3,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1504/ijdmb.2015.068954","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"34276060","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatic extraction of reference gene from literature in plants based on texting mining.","authors":"Lin He, Gengyu Shen, Fei Li, Shuiqing Huang","doi":"10.1504/ijdmb.2015.070063","DOIUrl":"https://doi.org/10.1504/ijdmb.2015.070063","url":null,"abstract":"<p><p>Real-Time Quantitative Polymerase Chain Reaction (qRT-PCR) is widely used in biological research. It is a key to the availability of qRT-PCR experiment to select a stable reference gene. However, selecting an appropriate reference gene usually requires strict biological experiment for verification with high cost in the process of selection. The scientific literature has accumulated a lot of achievements on the selection of reference genes. Therefore, mining reference genes under specific experiment environments from the literature can provide quite reliable reference genes for similar qRT-PCR experiments with the advantages of reliability, economy and efficiency. An auxiliary reference gene discovery method from literature is proposed in this paper which integrates machine learning, natural language processing and text mining approaches. The validity tests showed that this new method has a better precision and recall on the extraction of reference genes and their environments.</p>","PeriodicalId":54964,"journal":{"name":"International Journal of Data Mining and Bioinformatics","volume":null,"pages":null},"PeriodicalIF":0.3,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1504/ijdmb.2015.070063","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"34192165","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Andres Espindola, William Schneider, Peter R Hoyt, Stephen M Marek, Carla Garzon
{"title":"A new approach for detecting fungal and oomycete plant pathogens in next generation sequencing metagenome data utilising electronic probes.","authors":"Andres Espindola, William Schneider, Peter R Hoyt, Stephen M Marek, Carla Garzon","doi":"10.1504/ijdmb.2015.069422","DOIUrl":"https://doi.org/10.1504/ijdmb.2015.069422","url":null,"abstract":"<p><p>Early stage infections caused by fungal/oomycete spores may not be detected until signs or symptoms develop. Serological and molecular techniques are currently used for detecting these pathogens. Next-generation sequencing (NGS) has potential as a diagnostic tool, due to the capacity to target multiple unique signature loci of pathogens in an infected plant metagenome. NGS has significant potential for diagnosis of important eukaryotic plant pathogens. However, the assembly and analysis of huge amounts of sequence is laborious, time consuming, and not necessary for diagnostic purposes. Previous work demonstrated that a bioinformatic tool termed Electronic probe Diagnostic Nucleic acid Analysis (EDNA) had potential for greatly simplifying detecting fungal and oomycete plant pathogens in simulated metagenomes. The initial study demonstrated limitations for detection accuracy related to the analysis of matches between queries and metagenome reads. This study is a modification of EDNA demonstrating a better accuracy for detecting fungal and oomycete plant pathogens.</p>","PeriodicalId":54964,"journal":{"name":"International Journal of Data Mining and Bioinformatics","volume":null,"pages":null},"PeriodicalIF":0.3,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1504/ijdmb.2015.069422","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"34192169","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CORE: core-based synthetic minority over-sampling and borderline majority under-sampling technique.","authors":"Chumphol Bunkhumpornpat, Krung Sinapiromsaran","doi":"10.1504/ijdmb.2015.068952","DOIUrl":"https://doi.org/10.1504/ijdmb.2015.068952","url":null,"abstract":"<p><p>Class imbalance learning has recently drawn considerable attention among researchers. In this area, a rare class is the class of primary interest from the aim of classification. Unfortunately, traditional machine learning algorithms fail to detect this class because a huge majority class overwhelms a tiny minority class. In this paper, we propose a new technique called CORE to handle the class imbalance problem. The objective of CORE is to strengthen the core of a minority class and weaken the risk of misclassified minority instances near the borderline of a majority class. These core and borderline regions are defined by the applicability of a safe level. As a result, a minority class is more crowded and dominant. The experiment shows that CORE can significantly improve the predictive performance of a minority class when its dataset is imbalanced.</p>","PeriodicalId":54964,"journal":{"name":"International Journal of Data Mining and Bioinformatics","volume":null,"pages":null},"PeriodicalIF":0.3,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1504/ijdmb.2015.068952","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"34276056","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A comparative study on network motif discovery algorithms.","authors":"Yusuf Kavurucu","doi":"10.1504/ijdmb.2015.066777","DOIUrl":"https://doi.org/10.1504/ijdmb.2015.066777","url":null,"abstract":"<p><p>Subgraphs that occur in complex networks with significantly higher frequency than those in randomised networks are called network motifs. Such subgraphs often play important roles in the functioning of those networks. Finding network motifs is a computationally challenging problem. The main difficulties arise from the fact that real networks are large and the size of the search space grows exponentially with increasing network and motif size. Numerous methods have been developed to overcome these challenges. This paper provides a comparative study of the key network motif discovery algorithms in the literature and presents their algorithmic details on an example network.</p>","PeriodicalId":54964,"journal":{"name":"International Journal of Data Mining and Bioinformatics","volume":null,"pages":null},"PeriodicalIF":0.3,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1504/ijdmb.2015.066777","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"33906548","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Meta-learning framework applied in bioinformatics inference system design.","authors":"Tomás Arredondo, Wladimir Ormazábal","doi":"10.1504/ijdmb.2015.066775","DOIUrl":"https://doi.org/10.1504/ijdmb.2015.066775","url":null,"abstract":"<p><p>This paper describes a meta-learner inference system development framework which is applied and tested in the implementation of bioinformatic inference systems. These inference systems are used for the systematic classification of the best candidates for inclusion in bacterial metabolic pathway maps. This meta-learner-based approach utilises a workflow where the user provides feedback with final classification decisions which are stored in conjunction with analysed genetic sequences for periodic inference system training. The inference systems were trained and tested with three different data sets related to the bacterial degradation of aromatic compounds. The analysis of the meta-learner-based framework involved contrasting several different optimisation methods with various different parameters. The obtained inference systems were also contrasted with other standard classification methods with accurate prediction capabilities observed.</p>","PeriodicalId":54964,"journal":{"name":"International Journal of Data Mining and Bioinformatics","volume":null,"pages":null},"PeriodicalIF":0.3,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1504/ijdmb.2015.066775","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"33973466","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}