Statistical Applications in Genetics and Molecular Biology: Latest Articles, Page 4

A statistical method for analysing cospeciation in tritrophic ecology using electrical circuit theory.

IF 0.9, Zone 4 (Mathematics)

Statistical Applications in Genetics and Molecular Biology Pub Date : 2017-11-27 DOI: 10.1515/sagmb-2016-0049

Colleen Nooney, Stuart Barber, Arief Gusnanto, Walter R Gilks

Abstract: We introduce a new method to test efficiently for cospeciation in tritrophic systems. Our method utilises an analogy with electrical circuit theory to reduce higher order systems into bitrophic data sets that retain the information of the original system. We use a sophisticated permutation scheme that weights interactions between two trophic layers based on their connection to the third layer in the system. Our method has several advantages compared to the method of Mramba et al. [Mramba, L. K., S. Barber, K. Hommola, L. A. Dyer, J. S. Wilson, M. L. Forister and W. R. Gilks (2013): "Permutation tests for analyzing cospeciation in multiple phylogenies: applications in tri-trophic ecology," Stat. Appl. Genet. Mol. Biol., 12, 679-701.]. We do not require triangular interactions to connect the three phylogenetic trees, and an easily interpreted p-value is obtained in one step. Another advantage of our method is the scope for generalisation to higher order systems and phylogenetic networks. The performance of our method is compared to the methods of Hommola et al. [Hommola, K., J. E. Smith, Y. Qiu and W. R. Gilks (2009): "A permutation test of host-parasite cospeciation," Mol. Biol. Evol., 26, 1457-1468.] and Mramba et al. (2013) at the bitrophic and tritrophic level, respectively, by evaluating type I error and statistical power. The results show that our method produces unbiased p-values and has comparable power overall at both trophic levels. Our method was successfully applied to a dataset of leaf-mining moths, parasitoid wasps and host plants [Lopez-Vaamonde, C., H. Godfray, S. West, C. Hansson and J. Cook (2005): "The evolution of host use and unusual reproductive strategies in Achrysocharoides parasitoid wasps," J. Evol. Biol., 18, 1029-1041.], at both the bitrophic and tritrophic levels.

Volume 16, Issue 5-6, pp. 313-331.
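The permutation idea behind the bitrophic comparison can be sketched roughly as follows. This is only an illustration in the spirit of a host-parasite permutation test (correlating host and parasite distances over pairs of interaction links, then shuffling tip labels); it is not the circuit-theory reduction described in the abstract, and the function names, the toy distance matrices and the one-sided p-value convention are all assumptions made for the example.

```python
import math
import random
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either input has zero variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def link_correlation(host_dist, para_dist, links):
    """Correlate host and parasite distances over all pairs of links."""
    host_d, para_d = [], []
    for (h1, p1), (h2, p2) in combinations(links, 2):
        host_d.append(host_dist[h1][h2])
        para_d.append(para_dist[p1][p2])
    return pearson(host_d, para_d)


def permutation_test(host_dist, para_dist, links, n_perm=999, seed=0):
    """One-sided permutation p-value obtained by shuffling tip labels."""
    rng = random.Random(seed)
    observed = link_correlation(host_dist, para_dist, links)
    hosts = list(range(len(host_dist)))
    paras = list(range(len(para_dist)))
    hits = 0
    for _ in range(n_perm):
        h_perm, p_perm = hosts[:], paras[:]
        rng.shuffle(h_perm)
        rng.shuffle(p_perm)
        shuffled = [(h_perm[h], p_perm[p]) for h, p in links]
        if link_correlation(host_dist, para_dist, shuffled) >= observed:
            hits += 1
    # Add one to numerator and denominator so the p-value is never zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical toy data: two identical 4-tip trees, one-to-one links.
toy_dist = [
    [0, 2, 4, 4],
    [2, 0, 4, 4],
    [4, 4, 0, 2],
    [4, 4, 2, 0],
]
toy_links = [(0, 0), (1, 1), (2, 2), (3, 3)]
observed_r, p_value = permutation_test(toy_dist, toy_dist, toy_links)
print(observed_r, p_value)
```

With trees this small many label shuffles reproduce the observed statistic, so the toy p-value is not informative; the example only shows the mechanics of the test.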

Citations: 2

Bayesian estimation of differential transcript usage from RNA-seq data.

IF 0.9, Zone 4 (Mathematics)

Statistical Applications in Genetics and Molecular Biology Pub Date : 2017-11-27 DOI: 10.1515/sagmb-2017-0005

Panagiotis Papastamoulis, Magnus Rattray

Abstract: Next generation sequencing allows the identification of genes consisting of differentially expressed transcripts, a term which usually refers to changes in the overall expression level. A specific type of differential expression is differential transcript usage (DTU), which targets changes in the relative within-gene expression of a transcript. The contribution of this paper is to: (a) extend cjBitSeq, a previously introduced Bayesian model originally designed for identifying changes in overall expression levels, to the DTU context and (b) propose BayesDRIMSeq, a Bayesian version of DRIMSeq, a frequentist model for inferring DTU. cjBitSeq is a read-based model and performs fully Bayesian inference by MCMC sampling on the space of latent states of each transcript per gene. BayesDRIMSeq is a count-based model and estimates the Bayes factor of a DTU model against a null model using Laplace's approximation. The proposed models are benchmarked against the existing ones using a recent independent simulation study as well as a real RNA-seq dataset. Our results suggest that the Bayesian methods exhibit similar performance to DRIMSeq in terms of precision/recall but offer better calibration of the false discovery rate.

Volume 16, Issue 5-6, pp. 367-386.
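The distinction between overall differential expression and differential transcript usage can be made concrete with a small sketch. The counts below are invented for illustration, and `transcript_usage` is a hypothetical helper, not part of cjBitSeq, DRIMSeq or BayesDRIMSeq; it only computes each transcript's share of its gene's reads.

```python
def transcript_usage(counts):
    """Return each transcript's share of its gene's total read count."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("gene has no reads")
    return {tx: n / total for tx, n in counts.items()}


# Hypothetical read counts for one gene with two transcripts.
condition_a = {"tx1": 80, "tx2": 20}
condition_b = {"tx1": 20, "tx2": 80}

usage_a = transcript_usage(condition_a)
usage_b = transcript_usage(condition_b)

# Gene-level totals are identical, so there is no change in overall expression.
gene_total_a = sum(condition_a.values())
gene_total_b = sum(condition_b.values())

# The within-gene proportions differ, which is what DTU refers to.
usage_shift = {tx: usage_b[tx] - usage_a[tx] for tx in usage_a}

print(gene_total_a, gene_total_b)
print(usage_a, usage_b, usage_shift)
```

A gene like this one would be missed by a test on gene-level totals, while a DTU model compares the proportion vectors between conditions.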

Citations: 6

A statistical test for detecting parent-of-origin effects when parental information is missing.

IF 0.9, Zone 4 (Mathematics)

Statistical Applications in Genetics and Molecular Biology Pub Date : 2017-09-26 DOI: 10.1515/sagmb-2017-0007

Chiara Sacco, Cinzia Viroli, Mario Falchi

Citations: 0

Bayesian comparison of protein structures using partial Procrustes distance.

IF 0.9, Zone 4 (Mathematics)

Statistical Applications in Genetics and Molecular Biology Pub Date : 2017-09-26 DOI: 10.1515/sagmb-2016-0014

Nasim Ejlali, Mohammad Reza Faghihi, Mehdi Sadeghi

Citations: 2

Confidence intervals for heritability via Haseman-Elston regression.

IF 0.8, Zone 4 (Mathematics)

Statistical Applications in Genetics and Molecular Biology Pub Date : 2017-09-26 DOI: 10.1515/sagmb-2016-0076

Tamar Sofer

Abstract: Heritability is the proportion of phenotypic variance in a population that is attributable to individual genotypes. Heritability is considered an important measure in both evolutionary biology and in medicine, and is routinely estimated and reported in genetic epidemiology studies. In population-based genome-wide association studies (GWAS), mixed models are used to estimate variance components, from which a heritability estimate is obtained. The estimated heritability is the proportion of the model's total variance that is due to the genetic relatedness matrix (kinship measured from genotypes). Current practice is to use bootstrapping, which is slow, or normal asymptotic approximation to estimate the precision of the heritability estimate; however, this approximation fails to hold near the boundaries of the parameter space or when the sample size is small. In this paper we propose to estimate variance components via a Haseman-Elston regression, find the asymptotic distribution of the variance components and proportions of variance, and use them to construct confidence intervals (CIs). Our method is further developed to obtain unbiased variance components estimators and construct CIs by meta-analyzing information from multiple studies. We demonstrate our approach on data from the Hispanic Community Health Study/Study of Latinos (HCHS/SOL).

Volume 16, Issue 4, pp. 259-273. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5857391/pdf/nihms922922.pdf
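A textbook form of Haseman-Elston regression can be sketched as below: products of standardized phenotypes for each pair of individuals are regressed on the corresponding kinship entries, and the slope serves as the genetic variance estimate. This is a minimal sketch under stated assumptions (ordinary least squares with an intercept, off-diagonal pairs only, a hypothetical function name and made-up data); it does not reproduce the paper's estimator, its asymptotic distribution or its confidence intervals.

```python
from itertools import combinations
from statistics import fmean, pstdev


def haseman_elston(phenotype, kinship):
    """Slope from regressing z_i * z_j on K_ij over all pairs i < j."""
    mu, sd = fmean(phenotype), pstdev(phenotype)
    if sd == 0:
        raise ValueError("phenotype has zero variance")
    z = [(y - mu) / sd for y in phenotype]  # standardize to mean 0, variance 1

    pairs = list(combinations(range(len(z)), 2))
    x = [kinship[i][j] for i, j in pairs]   # pairwise relatedness
    p = [z[i] * z[j] for i, j in pairs]     # pairwise phenotype products

    mx, mp = fmean(x), fmean(p)
    sxx = sum((a - mx) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("kinship entries do not vary across pairs")
    sxy = sum((a - mx) * (b - mp) for a, b in zip(x, p))
    return sxy / sxx


# Hypothetical toy inputs: four individuals, two related pairs.
toy_phenotype = [1.0, -1.0, 1.0, -1.0]
toy_kinship = [
    [1.0, 0.0, 0.5, 0.0],
    [0.0, 1.0, 0.0, 0.5],
    [0.5, 0.0, 1.0, 0.0],
    [0.0, 0.5, 0.0, 1.0],
]
slope = haseman_elston(toy_phenotype, toy_kinship)
print(slope)
```

Four individuals are far too few for a meaningful estimate, and nothing constrains the slope to lie between 0 and 1 here; that boundary behaviour is part of why the abstract focuses on properly constructed confidence intervals.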

Citations: 0

FC1000: normalized gene expression changes of systematically perturbed human cells.

IF 0.9, Zone 4 (Mathematics)

Statistical Applications in Genetics and Molecular Biology Pub Date : 2017-09-26 DOI: 10.1515/sagmb-2016-0072

Ingrid M Lönnstedt, Sven Nelander

Abstract: The systematic study of transcriptional responses to genetic and chemical perturbations in human cells is still in its early stages. The largest available dataset to date is the newly released L1000 compendium. With its 1.3 million gene expression profiles of treated human cells it offers many opportunities for biomedical data mining, but also data normalization challenges of new dimensions. We developed a novel and practical approach to obtain accurate estimates of fold change response profiles from L1000, based on the RUV (Remove Unwanted Variation) statistical framework. Extending RUV to a big data setting, we propose an estimation procedure in which an underlying RUV model is tuned by feedback through dataset-specific statistical measures, reflecting p-value distributions and internal gene knockdown controls. Applying these metrics, termed evaluation endpoints, to disjoint data splits and integrating the results to select an optimal normalization, the procedure reduces bias and noise in the L1000 data, which in turn broadens the potential of this resource for pharmacological and functional genomic analyses. Our pipeline and normalization results are distributed as an R package (nelanderlab.org/FC1000.html).

Volume 16, Issue 4, pp. 217-242.

Citations: 4

Comparing the performance of linear and nonlinear principal components in the context of high-dimensional genomic data integration.

IF 0.9, Zone 4 (Mathematics)

Statistical Applications in Genetics and Molecular Biology Pub Date : 2017-07-26 DOI: 10.1515/sagmb-2016-0066

Shofiqul Islam, Sonia Anand, Jemila Hamid, Lehana Thabane, Joseph Beyene

Abstract: Linear principal component analysis (PCA) is a widely used approach to reduce the dimension of gene or miRNA expression data sets. This method relies on the linearity assumption, which often fails to capture the patterns and relationships inherent in the data. Thus, a nonlinear approach such as kernel PCA might be optimal. We develop a copula-based simulation algorithm that takes into account the degree of dependence and nonlinearity observed in these data sets. Using this algorithm, we conduct an extensive simulation to compare the performance of linear and kernel principal component analysis methods towards data integration and death classification. We also compare these methods using a real data set with gene and miRNA expression of lung cancer patients. The first few kernel principal components show poor performance compared to the linear principal components in this setting. Reducing dimensions using linear PCA and a logistic regression model for classification seems to be adequate for this purpose. Integrating information from multiple data sets using either of these two approaches leads to an improved classification accuracy for the outcome.

Volume 16, Issue 3, pp. 199-216.

Citations: 0

Genetic association test based on principal component analysis.

IF 0.9, Zone 4 (Mathematics)

Statistical Applications in Genetics and Molecular Biology Pub Date : 2017-07-26 DOI: 10.1515/sagmb-2016-0061

Zhongxue Chen, Shizhong Han, Kai Wang

Citations: 10

Regularized estimation in sparse high-dimensional multivariate regression, with application to a DNA methylation study.

IF 0.8, Zone 4 (Mathematics)

Statistical Applications in Genetics and Molecular Biology Pub Date : 2017-07-26 DOI: 10.1515/sagmb-2016-0073

Haixiang Zhang, Yinan Zheng, Grace Yoon, Zhou Zhang, Tao Gao, Brian Joyce, Wei Zhang, Joel Schwartz, Pantel Vokonas, Elena Colicino, Andrea Baccarelli, Lifang Hou, Lei Liu

Citations: 0

Mixture model-based association analysis with case-control data in genome wide association studies.

IF 0.9, Zone 4 (Mathematics)

Statistical Applications in Genetics and Molecular Biology Pub Date : 2017-07-26 DOI: 10.1515/sagmb-2016-0022

Fadhaa Ali, Jian Zhang

Abstract: Multilocus haplotype analysis of candidate variants with genome wide association studies (GWAS) data may provide evidence of association with disease, even when the individual loci themselves do not. Unfortunately, when a large number of candidate variants are investigated, identifying risk haplotypes can be very difficult. To meet the challenge, a number of approaches have been put forward in recent years. However, most of them are not directly linked to the disease penetrances of haplotypes and thus may not be efficient. To fill this gap, we propose a mixture model-based approach for detecting risk haplotypes. Under the mixture model, haplotypes are clustered directly according to their estimated disease penetrances. A theoretical justification of the above model is provided. Furthermore, we introduce a hypothesis test for haplotype inheritance patterns which underpin this model. The performance of the proposed approach is evaluated by simulations and real data analysis. The results show that the proposed approach outperforms an existing multiple testing method.

Volume 16, Issue 3, pp. 173-187.

Citations: 2