Statistical Applications in Genetics and Molecular Biology: Latest Articles, Page 4

Optimizing weighted gene co-expression network analysis with a multi-threaded calculation of the topological overlap matrix.

IF 0.9, Zone 4 (Mathematics)

Statistical Applications in Genetics and Molecular Biology Pub Date : 2021-11-09 DOI: 10.1515/sagmb-2021-0025

Min Shuai, Dongmei He, Xin Chen

Abstract: Biomolecular networks are often assumed to be scale-free hierarchical networks. Weighted gene co-expression network analysis (WGCNA) treats gene co-expression networks as undirected, scale-free, hierarchical weighted networks. The WGCNA R package stores a network as an adjacency matrix, calculates the topological overlap matrix (TOM) from it, and then identifies modules (sub-networks), each of which is assumed to be associated with a certain biological function. The most time-consuming step of WGCNA is the single-threaded calculation of the TOM from the adjacency matrix. In this paper, the single-threaded TOM algorithm is replaced by a multi-threaded one (with the parameters left at the WGCNA defaults). Rcpp is used to let R call a C++ function, which in turn uses OpenMP to start multiple threads that calculate the TOM from the adjacency matrix. On shared-memory multiprocessor systems, the calculation time decreases as the number of CPU cores increases. The algorithm can promote the application of WGCNA to large data sets and help other research fields identify sub-networks in undirected scale-free hierarchical weighted networks. The source code and usage instructions are available at https://github.com/do-somethings-haha/multi-threaded_calculate_unsigned_TOM_from_unsigned_or_signed_Adjacency_Matrix_of_WGCNA.

Statistical Applications in Genetics and Molecular Biology, vol. 20(4-6), pp. 145-153.
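The row-wise independence that makes the TOM calculation easy to parallelize can be sketched in a few lines. The block below is a minimal illustration, not the authors' Rcpp/OpenMP code: it assumes the standard unsigned TOM formula, TOM_ij = (sum over u of a_iu * a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij), with k_i the connectivity of node i, and the function name `unsigned_tom` and the `n_threads` parameter are hypothetical. Python threads are used only to show how rows can be handed to separate workers; because of the interpreter lock they will not reproduce the speed-up reported for OpenMP.

```python
from concurrent.futures import ThreadPoolExecutor


def unsigned_tom(adj, n_threads=4):
    """Unsigned topological overlap matrix of a symmetric adjacency matrix.

    adj is a list of lists with entries in [0, 1]; the diagonal is ignored.
    Each row of the result depends only on adj and the connectivities,
    so rows can be computed by independent workers.
    """
    n = len(adj)
    # Connectivity k_i: sum of adjacencies from node i to every other node.
    k = [sum(adj[i][u] for u in range(n) if u != i) for i in range(n)]

    def row(i):
        out = [0.0] * n
        for j in range(n):
            if i == j:
                out[j] = 1.0  # a node fully overlaps with itself
                continue
            # Strength of the connections that i and j share via third nodes.
            shared = sum(adj[i][u] * adj[u][j]
                         for u in range(n) if u != i and u != j)
            out[j] = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
        return out

    # One task per row; pool.map returns the rows in order.
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        return list(pool.map(row, range(n)))


# Toy 3-gene network (values chosen only for illustration).
adjacency = [
    [1.0, 0.5, 0.2],
    [0.5, 1.0, 0.8],
    [0.2, 0.8, 1.0],
]
tom = unsigned_tom(adjacency, n_threads=2)
for tom_row in tom:
    print([round(value, 3) for value in tom_row])
```

In a compiled implementation the same outer loop over rows is the natural place for a parallel-for directive, since no row writes to memory that another row reads.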

Citations: 5

A hierarchical Bayesian approach for detecting global microbiome associations.

IF 0.9, Zone 4 (Mathematics)

Statistical Applications in Genetics and Molecular Biology Pub Date : 2021-11-01 DOI: 10.1515/sagmb-2021-0047

Farhad Hatami, Emma Beamish, Albert Davies, Rachael Rigby, Frank Dondelinger

Abstract: The human gut microbiome has been shown to be associated with a variety of human diseases, including cancer, metabolic conditions and inflammatory bowel disease. Current approaches for detecting microbiome associations are limited by relying on specific measures of ecological distance, or by only allowing for the detection of associations with individual bacterial species rather than the whole microbiome. In this work, we develop a novel hierarchical Bayesian model for detecting global microbiome associations. Our method is not dependent on a choice of distance measure, and is able to incorporate phylogenetic information about microbial species. We perform extensive simulation studies and show that our method allows for consistent estimation of global microbiome effects. Additionally, we investigate the performance of the model on two real-world microbiome studies: a study of microbiome-metabolome associations in inflammatory bowel disease, and a study of associations between diet and the gut microbiome in mice. We show that we can use the method to reliably detect associations in real-world datasets with varying numbers of samples and covariates.

Statistical Applications in Genetics and Molecular Biology, vol. 20(3), pp. 85-100. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9125803/pdf/

Citations: 0

Frontmatter

IF 0.9, Zone 4 (Mathematics)

Statistical Applications in Genetics and Molecular Biology Pub Date : 2021-10-01 DOI: 10.1515/sagmb-2021-frontmatter3

Citations: 0

Frontmatter

IF 0.9, Zone 4 (Mathematics)

Statistical Applications in Genetics and Molecular Biology Pub Date : 2021-02-01 DOI: 10.1515/sagmb-2021-frontmatter1

Citations: 0

Measuring evolutionary cancer dynamics from genome sequencing, one patient at a time

IF 0.9, Zone 4 (Mathematics)

Statistical Applications in Genetics and Molecular Biology Pub Date : 2020-12-01 DOI: 10.1515/sagmb-2020-0075

G. Caravagna

Citations: 1

Inferring dynamic gene regulatory networks with low-order conditional independencies – an evaluation of the method

IF 0.9, Zone 4 (Mathematics)

Statistical Applications in Genetics and Molecular Biology Pub Date : 2020-12-01 DOI: 10.1515/sagmb-2020-0051

Hamda Ajmal, M. G. Madden

Abstract: Over a decade ago, Lèbre (2009) proposed an inference method, G1DBN, to learn the structure of gene regulatory networks (GRNs) from high-dimensional, sparse time-series gene expression data. The approach is based on the concept of low-order conditional independence graphs, extended to dynamic Bayesian networks (DBNs). Lèbre presents results to demonstrate that the method yields better structural accuracy than the related Lasso and Shrinkage methods, particularly where the data is sparse, that is, where the number of time measurements n is much smaller than the number of genes p. This paper challenges these claims using a careful experimental analysis, showing that the GRNs reverse-engineered from time-series data using the G1DBN approach are less accurate than claimed by Lèbre (2009). We also show that the Lasso method yields higher structural accuracy than G1DBN for graphs learned from the simulated data, particularly when the data is sparse ($n \ll p$). The Lasso method is also better than G1DBN at identifying the transcription factors (TFs) involved in the cell cycle of Saccharomyces cerevisiae.

Citations: 1

Distinct characteristics of correlation analysis at the single-cell and the population level

IF 0.9, Zone 4 (Mathematics)

Statistical Applications in Genetics and Molecular Biology Pub Date : 2020-08-19 DOI: 10.21203/rs.3.rs-42825/v1

Guoyu Wu, Yuchao Li

Abstract: Correlation analysis is widely used in biological studies to infer molecular relationships within biological networks. Recently, single-cell analysis has drawn tremendous interest for its ability to obtain high-resolution molecular phenotypes. It turns out that there is little overlap between the co-expressed genes identified in single-cell level investigations and those identified in population level investigations. However, the nature of the relationship between correlations at the single-cell and population levels remains unclear. In this manuscript, we aimed to unveil the origin of the differences between the correlation coefficients at the single-cell level and those at the population level, and to bridge the gap between them. By developing formulations that link correlations at the single-cell and the population level, we illustrated that aggregated correlations could be stronger than, weaker than or equal to the corresponding individual correlations, depending on the variations and the correlations within the population. When the correlation within the population is weaker than the individual correlation, the aggregated correlation is stronger than the corresponding individual correlation. In addition, our data indicated that the aggregated correlation is more likely to be stronger than the corresponding individual correlation, and that it was rare to find gene pairs strongly correlated exclusively at the single-cell level. Through a bottom-up approach to modeling interactions between molecules in a signaling cascade or in multi-regulator-controlled gene expression, we found, surprisingly, that an interaction between two components could not be excluded simply on the basis of their low correlation coefficients, which suggests reconsidering connectivity within biological networks that was derived solely from correlation analysis. We also investigated the impact of random technical measurement errors on the correlation coefficients at the single-cell level and the population level. The results indicate that the aggregated correlation is relatively robust and less affected. Because of the heterogeneity among single cells, correlation coefficients calculated from single-cell level data might differ from those calculated from population level data. Depending on the specific question being asked, proper sampling and normalization procedures should be carried out before any conclusions are drawn.
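A toy calculation can make the single-cell versus population distinction concrete. The sketch below is an illustration under stated assumptions, not the formulation developed in the paper: the data are invented, the "population level" value is taken to be the per-sample mean over cells, and the helper `pearson` and all variable names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented expression values of two genes in three cells from each of
# three samples. Within every sample the genes are anti-correlated,
# while the sample averages rise together.
samples = {
    "A": ([1, 2, 3], [3, 2, 1]),
    "B": ([4, 5, 6], [6, 5, 4]),
    "C": ([7, 8, 9], [9, 8, 7]),
}

# Single-cell level: pool every cell from every sample.
pooled_x = [x for xs, _ in samples.values() for x in xs]
pooled_y = [y for _, ys in samples.values() for y in ys]
single_cell_r = pearson(pooled_x, pooled_y)

# Population level: one aggregated (mean) value per sample.
mean_x = [sum(xs) / len(xs) for xs, _ in samples.values()]
mean_y = [sum(ys) / len(ys) for _, ys in samples.values()]
population_r = pearson(mean_x, mean_y)

# Correlation inside a single sample, for comparison.
within_r = pearson(*samples["A"])

print(f"within one sample: {within_r:.2f}")
print(f"pooled single cells: {single_cell_r:.2f}")
print(f"sample means: {population_r:.2f}")
```

With these invented numbers the three coefficients all differ, which is the kind of gap between levels of aggregation that the abstract describes.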

Citations: 0

Accuracy and sensitivity of different Bayesian methods for genomic prediction using simulation and real data.

IF 0.9, Zone 4 (Mathematics)

Statistical Applications in Genetics and Molecular Biology Pub Date : 2020-08-10 DOI: 10.1515/sagmb-2019-0007

Saheb Foroutaifar

Abstract: The main objectives of this study were to compare the prediction accuracy of different Bayesian methods for traits with a wide range of genetic architectures using simulation and real data, and to assess the sensitivity of these methods to violations of their assumptions. For the simulation study, different scenarios were implemented based on two traits with low or high heritability, different numbers of QTL and different distributions of QTL effects. For the real data analysis, a German Holstein dataset for milk fat percentage, milk yield, and somatic cell score was used. The simulation results showed that, with the exception of Bayes R, the methods were sensitive to changes in the number of QTL and the distribution of QTL effects. Having a distribution of QTL effects similar to the one that a given Bayesian method assumes for estimating marker effects did not improve its prediction accuracy. The Bayes B method gave accuracy higher than or equal to that of the other methods. The real data analysis showed that, as in the simulation scenarios with a large number of QTL, there was no difference between the accuracies of the different methods for any of the traits.

Statistical Applications in Genetics and Molecular Biology, vol. 19(3).

Citations: 1

Understanding hormonal crosstalk in Arabidopsis root development via emulation and history matching.

IF 0.9, Zone 4 (Mathematics)

Statistical Applications in Genetics and Molecular Biology Pub Date : 2020-07-13 DOI: 10.1515/sagmb-2018-0053

Samuel E Jackson, Ian Vernon, Junli Liu, Keith Lindsey

Abstract: A major challenge in plant developmental biology is to understand how plant growth is coordinated by interacting hormones and genes. To meet this challenge, it is important not only to use experimental data, but also to formulate a mathematical model. For the mathematical model to best describe the true biological system, it is necessary to understand the parameter space of the model, along with the links between the model, the parameter space and experimental observations. We develop sequential history matching methodology, using Bayesian emulation, to gain substantial insight into biological model parameter spaces. This is achieved by finding sets of acceptable parameters in accordance with successive sets of physical observations. These methods are then applied to a complex hormonal crosstalk model for Arabidopsis root growth. In this application, we demonstrate how an initial set of 22 observed trends reduces the volume of the set of acceptable inputs to a proportion of 6.1 × 10⁻⁷ of the original space. Additional sets of biologically relevant experimental data, each of size 5, reduce the size of this space by a further three and two orders of magnitude respectively. Hence, we provide insight into the constraints placed upon the model structure by, and the biological consequences of, measuring subsets of observations.

Statistical Applications in Genetics and Molecular Biology, vol. 19(2).
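The idea of shrinking the acceptable parameter space wave by wave can be sketched with a one-parameter toy problem. Everything in the block below is an assumption made for illustration: the two "models" are simple stand-ins for an emulator's mean prediction, the observations and variances are invented, and the implausibility measure |z - prediction| / sqrt(emulator variance + observation variance) with a cutoff of 3 is the form commonly used in the history matching literature, not necessarily the exact measure of the paper (which may also include a model discrepancy term).

```python
import math


def implausibility(z, prediction, var_emulator, var_observation):
    """Standardized distance between an observation and a predicted output."""
    return abs(z - prediction) / math.sqrt(var_emulator + var_observation)


def history_match(candidates, waves, cutoff=3.0):
    """Discard candidate inputs wave by wave.

    waves is a list of (predict, z, var_emulator, var_observation) tuples.
    Returns the surviving inputs and, after each wave, the fraction of the
    original candidates that is still not ruled out.
    """
    total = len(candidates)
    kept = list(candidates)
    fractions = []
    for predict, z, var_em, var_obs in waves:
        kept = [x for x in kept
                if implausibility(z, predict(x), var_em, var_obs) <= cutoff]
        fractions.append(len(kept) / total)
    return kept, fractions


# Hypothetical one-dimensional input space sampled on a regular grid.
grid = [-5.0 + 10.0 * i / 1000 for i in range(1001)]

# Two invented observation sets, applied in sequence.
waves = [
    (lambda x: x * x, 4.0, 0.09, 0.16),  # wave 1: output x^2 observed as 4
    (lambda x: x, 2.0, 0.09, 0.16),      # wave 2: output x observed as 2
]

kept, fractions = history_match(grid, waves)
print("fraction not ruled out after each wave:", fractions)
print("surviving range:", min(kept), "to", max(kept))
```

The first wave leaves two symmetric bands of inputs; the second observation removes the negative band, so each additional data set further reduces the non-implausible volume, mirroring the successive reductions reported in the abstract.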

Citations: 4

Bivariate traits association analysis using generalized estimating equations in family data.

IF 0.9, Zone 4 (Mathematics)

Statistical Applications in Genetics and Molecular Biology Pub Date : 2020-05-05 DOI: 10.1515/sagmb-2019-0030

Mariza de Andrade, Mauricio A Mazo Lopera, Nubia E Duarte

Abstract: Genome-wide association studies (GWAS) are becoming fundamental to the arduous task of deciphering the etiology of complex diseases. The majority of the statistical models used to address gene-disease association consider a single response variable. However, it is common for certain diseases, such as cardiovascular diseases, to have correlated phenotypes. GWAS typically sample unrelated individuals from a population, so shared familial risk factors are not investigated. In this paper, we propose applying a bivariate model to family data that associates two phenotypes with a genetic region. Using generalized estimating equations (GEE), we model two phenotypes, either discrete, continuous or a mixture of the two, as a function of genetic variables and other important covariates. We incorporate the kinship relationships into the working matrix, extended to a bivariate analysis. The estimation method and the joint gene-set effect on both phenotypes are developed in this work. We also evaluate the proposed methodology with a simulation study and an application to real data.

Statistical Applications in Genetics and Molecular Biology, vol. 19(2).

Citations: 0