A smoothed EM-algorithm for DNA methylation profiles from sequencing-based methods in cell lines or for a single cell type.

IF 0.8 | Zone 4 (Mathematics) | Q4 BIOCHEMISTRY & MOLECULAR BIOLOGY
Lajmi Lakhal-Chaieb, Celia M T Greenwood, Mohamed Ouhourane, Kaiqiong Zhao, Belkacem Abdous, Karim Oualkacha
Statistical Applications in Genetics and Molecular Biology, vol. 16, no. 5-6, pp. 333-347. Journal Article. Published 2017-11-27. DOI: 10.1515/sagmb-2016-0062 (https://doi.org/10.1515/sagmb-2016-0062)
Citations: 6

Abstract

We consider the assessment of DNA methylation profiles for sequencing-derived data from a single cell type or from cell lines. We derive a kernel-smoothed EM algorithm that can analyze an entire chromosome at once. It simultaneously corrects for experimental errors arising from either the pre-treatment steps or the sequencing stage and takes into account spatial correlations between DNA methylation profiles at neighbouring CpG sites. The outcomes of the algorithm are then used to (i) call the true methylation status at each CpG site, (ii) provide accurate smoothed estimates of DNA methylation levels, and (iii) detect differentially methylated regions. Simulations show that the proposed methodology outperforms existing analysis methods that either ignore the correlation between DNA methylation profiles at neighbouring CpG sites or do not correct for errors. The proposed inference procedure is illustrated through the analysis of a publicly available data set from a cell line of induced pluripotent H9 human embryonic stem cells and of a data set in which methylation measures were obtained for a small genomic region in three different immune cell types separated from whole blood.
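To make the idea of a kernel-smoothed EM iteration concrete, here is a minimal sketch in Python. It is not the authors' implementation and the abstract does not specify their model. The sketch assumes a binary latent methylation status per CpG site, known read-level error rates (the hypothetical parameters `p0` and `p1`), a Gaussian kernel, and a fixed `bandwidth`. The function name `smoothed_em` and all argument names are illustrative only.

```python
import numpy as np

def smoothed_em(pos, meth, total, p0=0.05, p1=0.95, bandwidth=200.0, n_iter=50):
    """Illustrative kernel-smoothed EM for per-site methylation status.

    Assumptions (not taken from the paper):
      - each site has a binary latent status S_i in {0, 1};
      - a read is observed as methylated with probability p1 if S_i = 1
        and p0 if S_i = 0 (p0, p1 are treated as known error rates);
      - the prior pi(t) = P(S = 1 at position t) varies smoothly in t.
    """
    pos = np.asarray(pos, dtype=float)
    meth = np.asarray(meth, dtype=float)
    unmeth = np.asarray(total, dtype=float) - meth

    # Log-likelihood of the reads under each latent state.
    # The binomial coefficient is identical in both and cancels out.
    ll1 = meth * np.log(p1) + unmeth * np.log1p(-p1)
    ll0 = meth * np.log(p0) + unmeth * np.log1p(-p0)

    # Row-normalized Gaussian kernel weights between all pairs of sites.
    # This dense matrix is O(n^2) and only suitable for small regions.
    d = (pos[:, None] - pos[None, :]) / bandwidth
    K = np.exp(-0.5 * d ** 2)
    K /= K.sum(axis=1, keepdims=True)

    pi = np.full(pos.shape, 0.5)
    for _ in range(n_iter):
        # E-step: posterior probability that each site is truly methylated,
        # computed as a numerically stable logistic of the log-odds.
        log_odds = np.log(pi) - np.log1p(-pi) + ll1 - ll0
        w = np.exp(-np.logaddexp(0.0, -log_odds))
        # Smoothed M-step: kernel-weighted average of neighbouring posteriors.
        pi = np.clip(K @ w, 1e-6, 1 - 1e-6)
    return pi, w

# Toy usage: 20 sites spaced 100 bp apart, 10 reads per site.
positions = np.arange(20) * 100
methylated_reads = np.array([9] * 10 + [1] * 10)
coverage = np.full(20, 10)
pi_hat, posterior = smoothed_em(positions, methylated_reads, coverage)
calls = posterior > 0.5  # called methylation status per site
```

In this sketch `pi_hat` plays the role of a smoothed methylation level and `calls` the role of a per-site status call. Detecting differentially methylated regions would need a further comparison step that is not shown here.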

Source journal
Statistical Applications in Genetics and Molecular Biology (BIOCHEMISTRY & MOLECULAR BIOLOGY; MATHEMATICAL & COMPUTATIONAL BIOLOGY)
Self-citation rate: 11.10%
Articles published: 8
Journal description: Statistical Applications in Genetics and Molecular Biology seeks to publish significant research on the application of statistical ideas to problems arising from computational biology. The focus of the papers should be on the relevant statistical issues but should contain a succinct description of the relevant biological problem being considered. The range of topics is wide and will include topics such as linkage mapping, association studies, gene finding and sequence alignment, protein structure prediction, design and analysis of microarray data, molecular evolution and phylogenetic trees, DNA topology, and database search strategies. Both original research and review articles will be warmly received.