Connectivity problems on heterogeneous graphs.

IF 1.5 · Zone 4 (Biology) · JCR Q4, BIOCHEMICAL RESEARCH METHODS
Algorithms for Molecular Biology Pub Date : 2019-03-08 eCollection Date: 2019-01-01 DOI:10.1186/s13015-019-0141-z
Jimmy Wu, Alex Khodaverdian, Benjamin Weitz, Nir Yosef
Citations: 0


Connectivity problems on heterogeneous graphs.


Background: Network connectivity problems are abundant in computational biology research, where graphs are used to represent a range of phenomena: from physical interactions between molecules to more abstract relationships such as gene co-expression. One common challenge in studying biological networks is the need to extract meaningful, small subgraphs out of large databases of potential interactions. A useful abstraction for this task has turned out to be the family of Steiner Network problems: given a reference "database" graph, find a parsimonious subgraph that satisfies a given set of connectivity demands. While this formulation has proved useful in a number of instances, the next challenge is to account for the fact that the reference graph may not be static. This can happen, for instance, when studying protein measurements in single cells or at different time points, whereby different subsets of conditions can have different protein milieus.

Results and discussion: We introduce the condition Steiner Network problem, in which we concomitantly consider a set of distinct biological conditions. Each condition is associated with a set of connectivity demands, as well as a set of edges that are assumed to be present in that condition. The goal of this problem is to find a minimal subgraph that satisfies all the demands through paths that are present in the respective condition. We show that introducing multiple conditions as an additional factor makes this problem much harder to approximate. Specifically, we prove that for C conditions, this new problem is NP-hard to approximate to a factor of C − ϵ, for every C ≥ 2 and ϵ > 0, and that this bound is tight. Moving beyond the worst case, we explore a special set of instances where the reference graph grows monotonically between conditions, and show that this problem admits substantially improved approximation algorithms. We also developed an integer linear programming solver for the general problem and demonstrate its ability to reach optimality with instances from the human protein interaction network.
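
To make the problem statement concrete, here is a minimal Python sketch of a condition Steiner Network instance and an exhaustive search for an optimal solution. It is an illustration only, not the authors' integer linear programming solver. Several details are assumptions made for the example: the graph is undirected, every edge has unit cost, each edge is a tuple `(u, v, conditions)` listing the conditions in which it is present, and each demand is a tuple `(source, target, condition)`. The function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def connected_in_condition(edges, cond, src, dst):
    """True if src reaches dst using only the given edges that are present in `cond`."""
    adj = defaultdict(set)
    for u, v, conds in edges:
        if cond in conds:  # edge exists in this condition (assumed undirected)
            adj[u].add(v)
            adj[v].add(u)
    seen, stack = {src}, [src]
    while stack:  # iterative depth-first search
        node = stack.pop()
        if node == dst:
            return True
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False


def satisfies_all(edges, demands):
    """True if every (source, target, condition) demand is met by the chosen edges."""
    return all(connected_in_condition(edges, c, a, b) for a, b, c in demands)


def brute_force_csn(edges, demands):
    """Smallest edge subset meeting all demands (unit costs), or None if infeasible.

    Tries subsets in order of increasing size, so it is exponential in the
    number of edges and only suitable for toy instances.
    """
    for k in range(len(edges) + 1):
        for subset in combinations(edges, k):
            if satisfies_all(subset, demands):
                return list(subset)
    return None


# Toy instance with two conditions (1 and 2); all names are made up.
example_edges = [
    ("a", "b", frozenset({1})),
    ("b", "c", frozenset({1, 2})),
    ("a", "c", frozenset({2})),
    ("a", "d", frozenset({1, 2})),
    ("d", "c", frozenset({1, 2})),
]
example_demands = [("a", "c", 1), ("a", "c", 2)]
solution = brute_force_csn(example_edges, example_demands)
```

In this toy instance the cheapest way to serve condition 2 alone is the single edge a-c, but that edge is absent in condition 1. The joint optimum is the two-edge path through d, which is present in both conditions. This shows why the conditions have to be considered together rather than solved one at a time and merged.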

Conclusion: Our results demonstrate that in contrast to most connectivity problems studied in computational biology, accounting for multiplicity of biological conditions adds considerable complexity, which we propose to address with a new solver. Importantly, our results extend to several network connectivity problems that are commonly used in computational biology, such as Prize-Collecting Steiner Tree, and provide insight into the theoretical guarantees for their applications in a multiple condition setting.

Source journal
Algorithms for Molecular Biology (category: Biology, Biochemical Research Methods)
CiteScore: 2.40
Self-citation rate: 10.00%
Publication count: 16
Review time: >12 weeks
Journal description: Algorithms for Molecular Biology publishes articles on novel algorithms for biological sequence and structure analysis, phylogeny reconstruction, and combinatorial algorithms and machine learning. Areas of interest include but are not limited to: algorithms for RNA and protein structure analysis, gene prediction and genome analysis, comparative sequence analysis and alignment, phylogeny, gene expression, machine learning, and combinatorial algorithms. Where appropriate, manuscripts should describe applications to real-world data. However, pure algorithm papers are also welcome if future applications to biological data are to be expected, or if they address complexity or approximation issues of novel computational problems in molecular biology. Articles about novel software tools will be considered for publication if they contain some algorithmically interesting aspects.