Semi-nonparametric modeling of topological domain formation from epigenetic data.

IF 1.7, Zone 4 (Biology), JCR Q4 (Biochemical Research Methods)

Algorithms for Molecular Biology. Pub Date: 2019-03-05. eCollection Date: 2019-01-01. DOI: 10.1186/s13015-019-0142-y

Emre Sefer, Carl Kingsford


Citations: 13

Abstract

Background: Hi-C experiments capturing the 3D genome architecture have led to the discovery of topologically-associated domains (TADs) that form an important part of the 3D genome organization and appear to play a role in gene regulation and other functions. Several histone modifications have been independently associated with TAD formation, but their combinatorial effects on domain formation remain poorly understood at a global scale.

Results: We propose a convex semi-nonparametric approach called nTDP, based on Bernstein polynomials, to explore the joint effects of histone markers on TAD formation and to predict TADs solely from the histone data. We find a small subset of modifications to be predictive of TADs across species. By inferring TADs using our trained model, we are able to predict TADs across different species and cell types without the use of Hi-C data, suggesting that the effect of these modifications is conserved. This work provides the first comprehensive joint model of the effect of histone markers on domain formation.

Conclusions: Our approach, nTDP, can form the basis of a unified, explanatory model of the relationship between epigenetic marks and topological domain structures. It can be used to predict domain boundaries for cell types, species, and conditions for which no Hi-C data is available. The model may also be of use for improving Hi-C-based domain finders.
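The Results mention that nTDP is built on Bernstein polynomials. The abstract does not give the model's formulation, so the following is only a generic sketch of the Bernstein basis itself, not the authors' implementation; the function names `bernstein_basis` and `bernstein_eval` are hypothetical and chosen for illustration. The degree-n basis functions are b_{k,n}(x) = C(n,k) x^k (1-x)^(n-k) on [0, 1], and a function is represented as a weighted sum of them.

```python
from math import comb


def bernstein_basis(n, x):
    """Return the n+1 Bernstein basis values b_{k,n}(x) for x in [0, 1]."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie in [0, 1]")
    # b_{k,n}(x) = C(n, k) * x^k * (1 - x)^(n - k)
    return [comb(n, k) * x**k * (1.0 - x) ** (n - k) for k in range(n + 1)]


def bernstein_eval(coeffs, x):
    """Evaluate f(x) = sum_k coeffs[k] * b_{k,n}(x), with n = len(coeffs) - 1.

    Illustrative helper only; it is not taken from the nTDP paper.
    """
    n = len(coeffs) - 1
    return sum(c * b for c, b in zip(coeffs, bernstein_basis(n, x)))


if __name__ == "__main__":
    # The basis values at any point in [0, 1] are nonnegative and sum to 1.
    print(sum(bernstein_basis(5, 0.3)))
    # The curve interpolates the first and last coefficients at the endpoints.
    print(bernstein_eval([1.0, 5.0, 3.0], 0.0), bernstein_eval([1.0, 5.0, 3.0], 1.0))
```

Because f(x) is linear in the coefficients, fitting them under a convex loss is a convex problem in general; how nTDP sets up and constrains that problem is not described in this abstract.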




Source journal

Algorithms for Molecular Biology (Biology: Biochemical Research Methods)

CiteScore: 2.40

Self-citation rate: 10.00%

Articles published: (not listed)

Review time: >12 weeks

About the journal: Algorithms for Molecular Biology publishes articles on novel algorithms for biological sequence and structure analysis, phylogeny reconstruction, and combinatorial algorithms and machine learning. Areas of interest include but are not limited to: algorithms for RNA and protein structure analysis, gene prediction and genome analysis, comparative sequence analysis and alignment, phylogeny, gene expression, machine learning, and combinatorial algorithms. Where appropriate, manuscripts should describe applications to real-world data. However, pure algorithm papers are also welcome if future applications to biological data are to be expected, or if they address complexity or approximation issues of novel computational problems in molecular biology. Articles about novel software tools will be considered for publication if they contain some algorithmically interesting aspects.