Guan-Sheng Liu, R. Ballweg, Alan Ashbaugh, Yin Zhang, Joseph Facciolo, M. Cushion, Tongli Zhang
{"title":"Correction to: A quantitative systems pharmacology (QSP) model for Pneumocystis treatment in mice","authors":"Guan-Sheng Liu, R. Ballweg, Alan Ashbaugh, Yin Zhang, Joseph Facciolo, M. Cushion, Tongli Zhang","doi":"10.1186/s12918-019-0708-9","DOIUrl":"https://doi.org/10.1186/s12918-019-0708-9","url":null,"abstract":"","PeriodicalId":9013,"journal":{"name":"BMC Systems Biology","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2019-08-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1186/s12918-019-0708-9","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44857679","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GNE: a deep learning framework for gene network inference by aggregating biological information.","authors":"Kishan Kc, Rui Li, Feng Cui, Qi Yu, Anne R Haake","doi":"10.1186/s12918-019-0694-y","DOIUrl":"https://doi.org/10.1186/s12918-019-0694-y","url":null,"abstract":"<p><strong>Background: </strong>The topological landscape of gene interaction networks provides a rich source of information for inferring functional patterns of genes or proteins. However, it is still a challenging task to aggregate heterogeneous biological information such as gene expression and gene interactions to achieve more accurate inference for prediction and discovery of new gene interactions. In particular, how to generate a unified vector representation to integrate diverse input data is a key challenge addressed here.</p><p><strong>Results: </strong>We propose a scalable and robust deep learning framework to learn embedded representations to unify known gene interactions and gene expression for gene interaction predictions. These low-dimensional embeddings derive deeper insights into the structure of rapidly accumulating and diverse gene interaction networks and greatly simplify downstream modeling. We compare the predictive power of our deep embeddings to the strong baselines. The results suggest that our deep embeddings achieve significantly more accurate predictions. Moreover, a set of novel gene interaction predictions are validated by up-to-date literature-based database entries.</p><p><strong>Conclusion: </strong>The proposed model demonstrates the importance of integrating heterogeneous information about genes for gene network inference. 
GNE is freely available under the GNU General Public License and can be downloaded from GitHub ( https://github.com/kckishan/GNE ).</p>","PeriodicalId":9013,"journal":{"name":"BMC Systems Biology","volume":"13 Suppl 2","pages":"38"},"PeriodicalIF":0.0,"publicationDate":"2019-04-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1186/s12918-019-0694-y","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"37125552","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pritha Dutta, Lichun Ma, Yusuf Ali, Peter M A Sloot, Jie Zheng
{"title":"Boolean network modeling of β-cell apoptosis and insulin resistance in type 2 diabetes mellitus.","authors":"Pritha Dutta, Lichun Ma, Yusuf Ali, Peter M A Sloot, Jie Zheng","doi":"10.1186/s12918-019-0692-0","DOIUrl":"https://doi.org/10.1186/s12918-019-0692-0","url":null,"abstract":"<p><strong>Background: </strong>Major alteration in lifestyle of human population has promoted Type 2 diabetes mellitus (T2DM) to the level of an epidemic. This metabolic disorder is characterized by insulin resistance and pancreatic β-cell dysfunction and apoptosis, triggered by endoplasmic reticulum (ER) stress, oxidative stress and cytokines. Computational modeling is necessary to consolidate information from various sources in order to obtain a comprehensive understanding of the pathogenesis of T2DM and to investigate possible interventions by performing in silico simulations.</p><p><strong>Results: </strong>In this paper, we propose a Boolean network model integrating the insulin resistance pathway with pancreatic β-cell apoptosis pathway which are responsible for T2DM. The model has five input signals, i.e. ER stress, oxidative stress, tumor necrosis factor α (TNF α), Fas ligand (FasL), and interleukin-6 (IL-6). We performed dynamical simulations using random order asynchronous update and with different combinations of the input signals. From the results, we observed that the proposed model made predictions that closely resemble the expression levels of genes in T2DM as reported in the literature.</p><p><strong>Conclusion: </strong>The proposed model can make predictions about expression levels of genes in T2DM that are in concordance with literature. Although experimental validation of the model is beyond the scope of this study, the model can be useful for understanding the aetiology of T2DM and discovery of therapeutic intervention for this prevalent complex disease. 
The files of our model and results are available at https://github.com/JieZheng-ShanghaiTech/boolean-t2dm .</p>","PeriodicalId":9013,"journal":{"name":"BMC Systems Biology","volume":"13 Suppl 2","pages":"36"},"PeriodicalIF":0.0,"publicationDate":"2019-04-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1186/s12918-019-0692-0","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"37124751","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A fast and efficient count-based matrix factorization method for detecting cell types from single-cell RNAseq data.","authors":"Shiquan Sun, Yabo Chen, Yang Liu, Xuequn Shang","doi":"10.1186/s12918-019-0699-6","DOIUrl":"https://doi.org/10.1186/s12918-019-0699-6","url":null,"abstract":"<p><strong>Background: </strong>Single-cell RNA sequencing (scRNAseq) data always involve various unwanted variables that can mask the true signal needed to identify cell types. A more efficient way of dealing with this issue is to extract low-dimensional information from high-dimensional gene expression data to represent cell-type structure. In the past two years, several powerful matrix factorization tools have been developed for scRNAseq data, such as NMF, ZIFA, pCMF and ZINB-WaVE. However, the existing approaches are either unable to directly model the raw counts of scRNAseq data or are very time-consuming when handling a large number of cells (e.g. n>500).</p><p><strong>Results: </strong>In this paper, we developed a fast and efficient count-based matrix factorization method (single-cell negative binomial matrix factorization, scNBMF) based on the TensorFlow framework to infer the low-dimensional structure of cell types. To make our method scalable, we conducted a series of experiments on three public scRNAseq data sets: brain, embryonic stem, and pancreatic islet. The experimental results show that scNBMF is more powerful at detecting cell types and 10-100 fold faster than the bespoke scRNAseq tools.</p><p><strong>Conclusions: </strong>In this paper, we proposed a fast and efficient count-based matrix factorization method, scNBMF, which is more powerful for detecting cell types. A series of experiments were performed on three public scRNAseq data sets. The results show that scNBMF is a more powerful tool in large-scale scRNAseq data analysis. 
scNBMF was implemented in R and Python, and the source code is freely available at https://github.com/sqsun .</p>","PeriodicalId":9013,"journal":{"name":"BMC Systems Biology","volume":"13 Suppl 2","pages":"28"},"PeriodicalIF":0.0,"publicationDate":"2019-04-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1186/s12918-019-0699-6","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"37126448","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fusing gene expressions and transitive protein-protein interactions for inference of gene regulatory networks.","authors":"Wenting Liu, Jagath C Rajapakse","doi":"10.1186/s12918-019-0695-x","DOIUrl":"10.1186/s12918-019-0695-x","url":null,"abstract":"<p><strong>Background: </strong>Systematic fusion of multiple data sources for Gene Regulatory Networks (GRN) inference remains a key challenge in systems biology. We incorporate information from protein-protein interaction networks (PPIN) into the process of GRN inference from gene expression (GE) data. However, existing PPIN remain sparse and transitive protein interactions can help predict missing protein interactions. We therefore propose a systematic probabilistic framework on fusing GE data and transitive protein interaction data to coherently build GRN.</p><p><strong>Results: </strong>We use a Gaussian Mixture Model (GMM) to soft-cluster GE data, allowing overlapping cluster memberships. Next, a heuristic method is proposed to extend sparse PPIN by incorporating transitive linkages. We then propose a novel way to score extended protein interactions by combining topological properties of PPIN and correlations of GE. Following this, GE data and extended PPIN are fused using a Gaussian Hidden Markov Model (GHMM) in order to identify gene regulatory pathways and refine interaction scores that are then used to constrain the GRN structure. We employ a Bayesian Gaussian Mixture (BGM) model to refine the GRN derived from GE data by using the structural priors derived from GHMM. 
Experiments on real yeast regulatory networks demonstrate both the feasibility of the extended PPIN in predicting transitive protein interactions and its effectiveness in improving the coverage and accuracy of the proposed method of fusing PPIN and GE to build GRN.</p><p><strong>Conclusion: </strong>The GE and PPIN fusion model outperforms both the state-of-the-art single data source models (CLR, GENIE3, TIGRESS) and existing fusion models under various constraints.</p>","PeriodicalId":9013,"journal":{"name":"BMC Systems Biology","volume":"13 Suppl 2","pages":"37"},"PeriodicalIF":0.0,"publicationDate":"2019-04-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6449891/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"37126452","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Network-based characterization of drug-protein interaction signatures with a space-efficient approach.","authors":"Yasuo Tabei, Masaaki Kotera, Ryusuke Sawada, Yoshihiro Yamanishi","doi":"10.1186/s12918-019-0691-1","DOIUrl":"https://doi.org/10.1186/s12918-019-0691-1","url":null,"abstract":"<p><strong>Background: </strong>Characterization of drug-protein interaction networks with biological features has become a challenge in recent pharmaceutical science toward a better understanding of polypharmacology.</p><p><strong>Results: </strong>We present a novel method for systematic analyses of the underlying features characteristic of drug-protein interaction networks, which we call \"drug-protein interaction signatures\", from the integration of large-scale heterogeneous data of drugs and proteins. We develop a new efficient algorithm for extracting informative drug-protein interaction signatures from the integration of large-scale heterogeneous data of drugs and proteins, which is made possible by space-efficient representations for fingerprints of drug-protein pairs and sparsity-induced classifiers.</p><p><strong>Conclusions: </strong>Our method infers a set of drug-protein interaction signatures consisting of the associations between drug chemical substructures, adverse drug reactions, protein domains, biological pathways, and pathway modules. 
We argue that these signatures are biologically meaningful and useful for predicting unknown drug-protein interactions and are expected to contribute to rational drug design.</p>","PeriodicalId":9013,"journal":{"name":"BMC Systems Biology","volume":"13 Suppl 2","pages":"39"},"PeriodicalIF":0.0,"publicationDate":"2019-04-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1186/s12918-019-0691-1","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"37127159","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xiaoying Li, Yaping Lin, Changlong Gu, Jialiang Yang
{"title":"FCMDAP: using miRNA family and cluster information to improve the prediction accuracy of disease related miRNAs.","authors":"Xiaoying Li, Yaping Lin, Changlong Gu, Jialiang Yang","doi":"10.1186/s12918-019-0696-9","DOIUrl":"https://doi.org/10.1186/s12918-019-0696-9","url":null,"abstract":"<p><strong>Background: </strong>Biological experiments have confirmed the association between miRNAs and various diseases. However, such experiments are costly and time consuming. Computational methods help select potential disease-related miRNAs to improve the efficiency of biological experiments.</p><p><strong>Methods: </strong>In this work, we develop a novel method using multiple types of data to calculate miRNA and disease similarity based on mutual information, and add miRNA family and cluster information to predict human disease-related miRNAs (FCMDAP). This method not only depends on known miRNA-disease associations but also accurately measures miRNA and disease similarity and resolves the problem of overestimation. FCMDAP uses the k most similar neighbor recommendation algorithm to predict the association score between miRNA and disease. Information about miRNA cluster is also used to improve prediction accuracy.</p><p><strong>Result: </strong>FCMDAP achieves an average AUC of 0.9165 based on leave-one-out cross validation. Case studies on colorectal, lung, and pancreatic neoplasms confirm 100, 98 and 96% of the top 50 predicted miRNAs, respectively. FCMDAP also exhibits satisfactory performance in predicting diseases without any related miRNAs and miRNAs without any related diseases.</p><p><strong>Conclusions: </strong>In this study, we present a computational method FCMDAP to improve the prediction accuracy of disease related miRNAs. 
FCMDAP could be an effective tool for further biological experiments.</p>","PeriodicalId":9013,"journal":{"name":"BMC Systems Biology","volume":"13 Suppl 2","pages":"26"},"PeriodicalIF":0.0,"publicationDate":"2019-04-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1186/s12918-019-0696-9","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"37125575","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Ultrafast clustering of single-cell flow cytometry data using FlowGrid.","authors":"Xiaoxin Ye, Joshua W K Ho","doi":"10.1186/s12918-019-0690-2","DOIUrl":"https://doi.org/10.1186/s12918-019-0690-2","url":null,"abstract":"<p><strong>Background: </strong>Flow cytometry is a popular technology for quantitative single-cell profiling of cell surface markers. It enables expression measurement of tens of cell surface protein markers in millions of single cells. It is a powerful tool for discovering cell sub-populations and quantifying cell population heterogeneity. Traditionally, scientists use manual gating to identify cell types, but the process is subjective and is not effective for large multidimensional data. Many clustering algorithms have been developed to analyse these data but most of them are not scalable to very large data sets with more than ten million cells.</p><p><strong>Results: </strong>Here, we present a new clustering algorithm that combines the advantages of the density-based clustering algorithm DBSCAN with the scalability of grid-based clustering. This new clustering algorithm is implemented in Python as an open source package, FlowGrid. FlowGrid is memory efficient and scales linearly with respect to the number of cells. We have evaluated the performance of FlowGrid against other state-of-the-art clustering programs and found that FlowGrid produces similar clustering results in substantially less time. For example, FlowGrid is able to complete a clustering task on a data set of 23.6 million cells in less than 12 seconds, while other algorithms take more than 500 seconds or fail with an error.</p><p><strong>Conclusions: </strong>FlowGrid is an ultrafast clustering algorithm for large single-cell flow cytometry data. 
The source code is available at https://github.com/VCCRI/FlowGrid .</p>","PeriodicalId":9013,"journal":{"name":"BMC Systems Biology","volume":"13 Suppl 2","pages":"35"},"PeriodicalIF":0.0,"publicationDate":"2019-04-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1186/s12918-019-0690-2","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"37124753","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Predicting disease-related phenotypes using an integrated phenotype similarity measurement based on HPO.","authors":"Hansheng Xue, Jiajie Peng, Xuequn Shang","doi":"10.1186/s12918-019-0697-8","DOIUrl":"https://doi.org/10.1186/s12918-019-0697-8","url":null,"abstract":"<p><strong>Background: </strong>Improving efficiency of disease diagnosis based on phenotype ontology is a critical yet challenging research area. Recently, Human Phenotype Ontology (HPO)-based semantic similarity has been effectively and widely used to identify causative genes and diseases. However, current phenotype similarity measurements consider only the annotations and hierarchy structure of HPO, neglecting the definition description of phenotype terms.</p><p><strong>Results: </strong>In this paper, we propose a novel phenotype similarity measurement, termed DisPheno, which adequately incorporates the definition of phenotype terms in addition to HPO structure and annotations to measure the similarity between phenotype terms. 
DisPheno also integrates phenotype term associations into phenotype-set similarity measurement using gene and disease annotations of phenotype terms.</p><p><strong>Conclusions: </strong>Compared with five existing state-of-the-art methods, DisPheno shows great performance in HPO-based phenotype semantic similarity measurement and improves the efficiency of disease identification, especially on noisy patient data.</p>","PeriodicalId":9013,"journal":{"name":"BMC Systems Biology","volume":"13 Suppl 2","pages":"34"},"PeriodicalIF":0.0,"publicationDate":"2019-04-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1186/s12918-019-0697-8","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"37127174","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yangyang Hao, Quan-Yang Duh, Richard T Kloos, Joshua Babiarz, R Mack Harrell, S Thomas Traweek, Su Yeon Kim, Grazyna Fedorowicz, P Sean Walsh, Peter M Sadow, Jing Huang, Giulia C Kennedy
{"title":"Identification of Hürthle cell cancers: solving a clinical challenge with genomic sequencing and a trio of machine learning algorithms.","authors":"Yangyang Hao, Quan-Yang Duh, Richard T Kloos, Joshua Babiarz, R Mack Harrell, S Thomas Traweek, Su Yeon Kim, Grazyna Fedorowicz, P Sean Walsh, Peter M Sadow, Jing Huang, Giulia C Kennedy","doi":"10.1186/s12918-019-0693-z","DOIUrl":"10.1186/s12918-019-0693-z","url":null,"abstract":"<p><strong>Background: </strong>Identification of Hürthle cell cancers by non-operative fine-needle aspiration biopsy (FNAB) of thyroid nodules is challenging. Resultingly, non-cancerous Hürthle lesions were conventionally distinguished from Hürthle cell cancers by histopathological examination of tissue following surgical resection. Reliance on histopathological evaluation requires patients to undergo surgery to obtain a diagnosis despite most being non-cancerous. It is highly desirable to avoid surgery and to provide accurate classification of benignity versus malignancy from FNAB preoperatively. In our first-generation algorithm, Gene Expression Classifier (GEC), we achieved this goal by using machine learning (ML) on gene expression features. The classifier is sensitive, but not specific due in part to the presence of non-neoplastic benign Hürthle cells in many FNAB.</p><p><strong>Results: </strong>We sought to overcome this low-specificity limitation by expanding the feature set for ML using next-generation whole transcriptome RNA sequencing and called the improved algorithm the Genomic Sequencing Classifier (GSC). The Hürthle identification leverages mitochondrial expression and we developed novel feature extraction mechanisms to measure chromosomal and genomic level loss-of-heterozygosity (LOH) for the algorithm. Additionally, we developed a multi-layered system of cascading classifiers to sequentially triage Hürthle cell-containing FNAB, including: 1. presence of Hürthle cells, 2. presence of neoplastic Hürthle cells, and 3. 
presence of benign Hürthle cells. The final Hürthle cell Index utilizes 1048 nuclear and mitochondrial genes; and Hürthle cell Neoplasm Index leverages LOH features as well as 2041 genes. Both indices are Support Vector Machine (SVM) based. The third classifier, the GSC Benign/Suspicious classifier, utilizes 1115 core genes and is an ensemble classifier incorporating 12 individual models.</p><p><strong>Conclusions: </strong>The accurate algorithmic depiction of this complex biological system among Hürthle subtypes results in a dramatic improvement of classification performance; specificity among Hürthle cell neoplasms increases from 11.8% with the GEC to 58.8% with the GSC, while maintaining the same sensitivity of 89%.</p>","PeriodicalId":9013,"journal":{"name":"BMC Systems Biology","volume":"13 Suppl 2","pages":"27"},"PeriodicalIF":0.0,"publicationDate":"2019-04-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1186/s12918-019-0693-z","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"37124578","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}