GRAPHICAL MODELS FOR ZERO-INFLATED SINGLE CELL GENE EXPRESSION.

IF 1.3 | Zone 4, Mathematics | Q2 STATISTICS & PROBABILITY
Annals of Applied Statistics Pub Date : 2019-06-01 Epub Date: 2019-06-17 DOI:10.1214/18-AOAS1213
Andrew McDavid, Raphael Gottardo, Noah Simon, Mathias Drton
Volume 13, Issue 2, Pages 848-873
Citations: 22

Abstract

Bulk gene expression experiments relied on aggregations of thousands of cells to measure the average expression in an organism. Advances in microfluidic and droplet sequencing now permit expression profiling in single cells. This study of cell-to-cell variation reveals that individual cells lack detectable expression of transcripts that appear abundant at the population level, giving rise to zero-inflated expression patterns. To infer gene co-regulatory networks from such data, we propose a multivariate Hurdle model. It comprises a mixture of singular Gaussian distributions. We employ neighborhood selection with the pseudo-likelihood and a group lasso penalty to select and fit undirected graphical models that capture conditional independences between genes. The proposed method is more sensitive than existing approaches in simulations, even under departures from our Hurdle model. The method is applied to data for T follicular helper cells and to a high-dimensional profile of mouse dendritic cells. It infers network structure not revealed by other methods or in bulk data sets. An R implementation is available at https://github.com/amcdavid/HurdleNormal.
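
As a rough illustration of the two ingredients named in the abstract, the sketch below splits a zero-inflated expression vector into an "expressed or not" indicator plus its continuous value, and applies the standard block soft-thresholding step associated with a group lasso penalty. This is not the authors' HurdleNormal code: the function names, the grouping of parameters per candidate edge, and the example numbers are all assumptions made for illustration.

```python
import numpy as np

def hurdle_features(y):
    """Split expression values into a 0/1 'detected' indicator and the raw value.

    Illustrative helper (hypothetical name): zeros mean 'not detected'.
    """
    y = np.asarray(y, dtype=float)
    indicator = (y != 0).astype(float)
    return indicator, y

def group_soft_threshold(theta, lam):
    """Proximal operator of lam * ||theta||_2 (block soft-thresholding).

    The whole group is set to zero when its Euclidean norm is at most lam;
    otherwise it is shrunk toward zero while keeping its direction.
    """
    theta = np.asarray(theta, dtype=float)
    norm = np.linalg.norm(theta)
    if norm <= lam:
        return np.zeros_like(theta)
    return (1.0 - lam / norm) * theta

# Assumed example: one group of parameters tying a pair of genes together.
edge_params = np.array([3.0, 4.0])                 # Euclidean norm is 5
shrunk = group_soft_threshold(edge_params, 1.0)    # scaled by 1 - 1/5
dropped = group_soft_threshold(edge_params, 6.0)   # norm <= lam, so all zeros
indicator, values = hurdle_features([0.0, 2.5, 0.0, 1.2])
```

Because the penalty acts on the group as a whole, a candidate edge is either kept with all of its parameters or removed entirely, which is what makes the penalty usable for selecting a graph.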

Source journal

Annals of Applied Statistics (Social Sciences: Statistics & Probability)
CiteScore: 3.10
Self-citation rate: 5.60%
Articles published: 131
Review time: 6-12 weeks
About the journal: Statistical research spans an enormous range from direct subject-matter collaborations to pure mathematical theory. The Annals of Applied Statistics, the newest journal from the IMS, is aimed at papers in the applied half of this range. Published quarterly in both print and electronic form, the journal aims to provide a timely and unified forum for all areas of applied statistics.