{"title":"Ensuring Rapid Mixing and Low Bias for Asynchronous Gibbs Sampling","authors":"Christopher De Sa, Christopher Ré, K. Olukotun","doi":"10.24963/ijcai.2017/672","DOIUrl":"https://doi.org/10.24963/ijcai.2017/672","url":null,"abstract":"Gibbs sampling is a Markov chain Monte Carlo technique commonly used for estimating marginal distributions. To speed up Gibbs sampling, there has recently been interest in parallelizing it by executing asynchronously. While empirical results suggest that many models can be efficiently sampled asynchronously, traditional Markov chain analysis does not apply to the asynchronous case, and thus asynchronous Gibbs sampling is poorly understood. In this paper, we derive a better understanding of the two main challenges of asynchronous Gibbs: bias and mixing time. We show experimentally that our theoretical results match practical outcomes.","PeriodicalId":89793,"journal":{"name":"JMLR workshop and conference proceedings","volume":"1 1","pages":"1567-1576"},"PeriodicalIF":0.0,"publicationDate":"2016-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77475373","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Convex Atomic-Norm Approach to Multiple Sequence Alignment and Motif Discovery.","authors":"Ian E H Yen, Xin Lin, Jiong Zhang, Pradeep Ravikumar, Inderjit S Dhillon","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Multiple Sequence Alignment and Motif Discovery, both known to be NP-hard, are two fundamental tasks in Bioinformatics. Existing approaches to these two problems are based on either local search methods, such as Expectation Maximization (EM) and Gibbs Sampling, or greedy heuristic methods. In this work, we develop a convex relaxation approach to both problems based on the recent concept of atomic norm and develop a new algorithm, termed Greedy Direction Method of Multiplier, for solving the convex relaxation with two convex atomic constraints. Experiments show that our convex relaxation approach produces higher-quality solutions than the standard tools widely used in the Bioinformatics community for the Multiple Sequence Alignment and Motif Discovery problems.</p>","PeriodicalId":89793,"journal":{"name":"JMLR workshop and conference proceedings","volume":"48 ","pages":"2272-2280"},"PeriodicalIF":0.0,"publicationDate":"2016-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4993214/pdf/nihms808905.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"34389307","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dirichlet Process Mixture Model for Correcting Technical Variation in Single-Cell Gene Expression Data.","authors":"Sandhya Prabhakaran, Elham Azizi, Ambrose Carr, Dana Pe'er","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>We introduce an iterative normalization and clustering method for single-cell gene expression data. The emerging technology of single-cell RNA-seq gives access to gene expression measurements for thousands of cells, allowing discovery and characterization of cell types. However, the data is confounded by technical variation emanating from experimental errors and cell type-specific biases. Current approaches perform a global normalization prior to analyzing biological signals, which does not resolve missing data or variation dependent on latent cell types. Our model is formulated as a hierarchical Bayesian mixture model with cell-specific scalings that aid the iterative normalization and clustering of cells, teasing apart technical variation from biological signals. We demonstrate that this approach is superior to global normalization followed by clustering. We show identifiability and weak convergence guarantees of our method and present a scalable Gibbs inference algorithm. This method improves cluster inference in both synthetic and real single-cell data compared with previous methods, and allows easy interpretation and recovery of the underlying structure and cell types.</p>","PeriodicalId":89793,"journal":{"name":"JMLR workshop and conference proceedings","volume":"48 ","pages":"1070-1079"},"PeriodicalIF":0.0,"publicationDate":"2016-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6004614/pdf/nihms972080.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"36243698","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Manifold-valued Dirichlet Processes.","authors":"Hyunwoo J Kim, Jia Xu, Baba C Vemuri, Vikas Singh","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Statistical models for manifold-valued data permit capturing the intrinsic nature of the curved spaces in which the data lie and have been a topic of research for several decades. Typically, these formulations use geodesic curves and distances defined <i>locally</i> for most cases; this makes it hard to design parametric models <i>globally</i> on smooth manifolds. Thus, most (manifold specific) parametric models available today assume that the data lie in a small neighborhood on the manifold. To address this 'locality' problem, we propose a novel nonparametric model which unifies multivariate general linear models (MGLMs) using multiple tangent spaces. Our framework generalizes existing work on (both Euclidean and non-Euclidean) general linear models providing a recipe to globally extend the locally-defined parametric models (using a mixture of local models). By grouping observations into sub-populations at multiple tangent spaces, our method provides insights into the hidden structure (geodesic relationships) in the data. This yields a framework to group observations and discover geodesic relationships between covariates <i>X</i> and manifold-valued responses <i>Y</i>, which we call Dirichlet process mixtures of multivariate general linear models (DP-MGLM) on Riemannian manifolds. Finally, we present proof of concept experiments to validate our model.</p>","PeriodicalId":89793,"journal":{"name":"JMLR workshop and conference proceedings","volume":"2015 ","pages":"1199-1208"},"PeriodicalIF":0.0,"publicationDate":"2015-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4783460/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72212239","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust Estimation of Transition Matrices in High Dimensional Heavy-tailed Vector Autoregressive Processes.","authors":"Huitong Qiu, Sheng Xu, Fang Han, Han Liu, Brian Caffo","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Gaussian vector autoregressive (VAR) processes have been extensively studied in the literature. However, Gaussian assumptions are stringent for heavy-tailed time series that frequently arise in finance and economics. In this paper, we develop a unified framework for modeling and estimating heavy-tailed VAR processes. In particular, we generalize the Gaussian VAR model by an elliptical VAR model that naturally accommodates heavy-tailed time series. Under this model, we develop a quantile-based robust estimator for the transition matrix of the VAR process. We show that the proposed estimator achieves parametric rates of convergence in high dimensions. This is the first work to analyze heavy-tailed high dimensional VAR processes. As an application of the proposed framework, we investigate Granger causality in the elliptical VAR process, and show that the robust transition matrix estimator induces sign-consistent estimators of Granger causality. The empirical performance of the proposed methodology is demonstrated on both synthetic and real data. We show that the proposed estimator is robust to heavy tails, and exhibits superior performance in stock price prediction.</p>","PeriodicalId":89793,"journal":{"name":"JMLR workshop and conference proceedings","volume":"37 ","pages":"1843-1851"},"PeriodicalIF":0.0,"publicationDate":"2015-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5266499/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89720992","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multiple Testing under Dependence via Semiparametric Graphical Models.","authors":"Jie Liu, Chunming Zhang, Elizabeth Burnside, David Page","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>It has been shown that graphical models can be used to leverage the dependence in large-scale multiple testing problems with significantly improved performance (Sun & Cai, 2009; Liu et al., 2012). These graphical models are fully parametric and require that we know the parameterization of <i>f</i><sub>1</sub> - the density function of the test statistic under the alternative hypothesis. However in practice, <i>f</i><sub>1</sub> is often heterogeneous, and cannot be estimated with a simple parametric distribution. We propose a novel semiparametric approach for multiple testing under dependence, which estimates <i>f</i><sub>1</sub> adaptively. This semiparametric approach exactly generalizes the local FDR procedure (Efron et al., 2001) and connects with the BH procedure (Benjamini & Hochberg, 1995). A variety of simulations show that our semiparametric approach outperforms classical procedures which assume independence and the parametric approaches which capture dependence.</p>","PeriodicalId":89793,"journal":{"name":"JMLR workshop and conference proceedings","volume":"32 2","pages":"955-963"},"PeriodicalIF":0.0,"publicationDate":"2014-12-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4190841/pdf/nihms612860.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"32741386","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Spherical Hamiltonian Monte Carlo for Constrained Target Distributions.","authors":"Shiwei Lan, Bo Zhou, Babak Shahbaba","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Statistical models with constrained probability distributions are abundant in machine learning. Some examples include regression models with norm constraints (e.g., Lasso), probit models, many copula models, and Latent Dirichlet Allocation (LDA) models. Bayesian inference involving probability distributions confined to constrained domains could be quite challenging for commonly used sampling algorithms. For such problems, we propose a novel Markov Chain Monte Carlo (MCMC) method that provides a general and computationally efficient framework for handling boundary conditions. Our method first maps the <i>D</i>-dimensional constrained domain of parameters to the unit ball [Formula: see text], then augments it to a <i>D</i>-dimensional sphere <b>S</b><sup><i>D</i></sup> such that the original boundary corresponds to the equator of <b>S</b><sup><i>D</i></sup>. This way, our method handles the constraints implicitly by moving freely on the sphere, generating proposals that remain within boundaries when mapped back to the original space. To improve the computational efficiency of our algorithm, we divide the dynamics into several parts such that the resulting split dynamics has a partial analytical solution as a geodesic flow on the sphere. We apply our method to several examples including truncated Gaussian, Bayesian Lasso, Bayesian bridge regression, and a copula model for identifying synchrony among multiple neurons. Our results show that the proposed method can provide a natural and efficient framework for handling several types of constraints on target distributions.</p>","PeriodicalId":89793,"journal":{"name":"JMLR workshop and conference proceedings","volume":"32 ","pages":"629-637"},"PeriodicalIF":0.0,"publicationDate":"2014-06-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4407381/pdf/nihms672830.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"33133055","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Estimating Diffusion Network Structures: Recovery Conditions, Sample Complexity & Soft-thresholding Algorithm.","authors":"Hadi Daneshmand, Manuel Gomez-Rodriguez, Le Song, Bernhard Schölkopf","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Information spreads across social and technological networks, but often the network structures are hidden from us and we only observe the traces left by the diffusion processes, called <i>cascades</i>. Can we recover the hidden network structures from these observed cascades? What kind of cascades and how many cascades do we need? Are there some network structures which are more difficult than others to recover? Can we design efficient inference algorithms with provable guarantees? Despite the increasing availability of cascade-data and methods for inferring networks from these data, a thorough theoretical understanding of the above questions remains largely unexplored in the literature. In this paper, we investigate the network structure inference problem for a general family of continuous-time diffusion models using an [Formula: see text]-regularized likelihood maximization framework. We show that, as long as the cascade sampling process satisfies a natural incoherence condition, our framework can recover the correct network structure with high probability if we observe <i>O</i>(<i>d</i><sup>3</sup> log <i>N</i>) cascades, where <i>d</i> is the maximum number of parents of a node and <i>N</i> is the total number of nodes. Moreover, we develop a simple and efficient soft-thresholding inference algorithm, which we use to illustrate the consequences of our theoretical results, and show that our framework outperforms other alternatives in practice.</p>","PeriodicalId":89793,"journal":{"name":"JMLR workshop and conference proceedings","volume":"32 2","pages":"793-801"},"PeriodicalIF":0.0,"publicationDate":"2014-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4412853/pdf/nihms-680553.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"33147202","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Influence Function Learning in Information Diffusion Networks.","authors":"Nan Du, Yingyu Liang, Maria-Florina Balcan, Le Song","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Can we learn the influence of a set of people in a social network from cascades of information diffusion? This question is often addressed by a two-stage approach: first learn a diffusion model, and then calculate the influence based on the learned model. Thus, the success of this approach relies heavily on the correctness of the diffusion model which is hard to verify for real world data. In this paper, we exploit the insight that the influence functions in many diffusion models are coverage functions, and propose a novel parameterization of such functions using a convex combination of random basis functions. Moreover, we propose an efficient maximum likelihood based algorithm to learn such functions directly from cascade data, and hence bypass the need to specify a particular diffusion model in advance. We provide both theoretical and empirical analysis for our approach, showing that the proposed approach can provably learn the influence function with low sample complexity, be robust to the unknown diffusion models, and significantly outperform existing approaches in both synthetic and real world data.</p>","PeriodicalId":89793,"journal":{"name":"JMLR workshop and conference proceedings","volume":"32 2","pages":"2016-2024"},"PeriodicalIF":0.0,"publicationDate":"2014-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4427574/pdf/nihms680551.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"33303443","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}