Robust Optimal Metabolic Factories.

IF 1.6 | Zone 4, Biology | JCR Q4, BIOCHEMICAL RESEARCH METHODS

Journal of Computational Biology | Pub Date: 2024-10-01 | Epub Date: 2024-09-27 | DOI: 10.1089/cmb.2024.0748

Spencer Krieger, John Kececioglu

Pages: 1045-1086 | Publication type: Journal Article

Citations: 0

Abstract

Perhaps the most fundamental model in synthetic and systems biology for inferring pathways in metabolic reaction networks is a metabolic factory: a system of reactions that starts from a set of source compounds and produces a set of target molecules, while conserving or not depleting intermediate metabolites. Finding a shortest factory, one that minimizes a sum of real-valued weights on its reactions to infer the most likely pathway, is NP-complete. The current state-of-the-art for shortest factories solves a mixed-integer linear program with a major drawback: it requires the user to set a critical parameter, where too large a value can make optimal solutions infeasible, while too small a value can yield degenerate solutions due to numerical error. We present the first robust algorithm for optimal factories that is both parameter-free (relieving the user from determining a parameter setting) and degeneracy-free (guaranteeing it finds an optimal nondegenerate solution). We also give, for the first time, a complete characterization of the graph-theoretic structure of shortest factories, which reveals an important class of degenerate solutions that was overlooked and potentially output by the prior state-of-the-art. We show degeneracy is precisely due to invalid stoichiometries in reactions, and provide an efficient algorithm for identifying all such misannotations in a metabolic network. In addition, we settle the relationship between the two established pathway models of hyperpaths and factories by proving hyperpaths actually comprise a subclass of factories. Comprehensive experiments over all instances from the standard metabolic reaction databases in the literature demonstrate our parameter-free exact algorithm is fast in practice, quickly finding optimal factories in large real-world networks containing thousands of reactions.
A preliminary implementation of our robust algorithm for shortest factories in a new tool called Freeia is available free for research use at http://freeia.cs.arizona.edu.
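
To make the factory definition in the abstract concrete, the following is a minimal Python sketch that checks whether a set of reaction rates forms a factory on a toy network. It is not the authors' algorithm and is not taken from Freeia. It assumes one common formalization: rates are nonnegative, every non-source compound has nonnegative net production (it is conserved or accumulates), and every target has strictly positive net production. The network `REACTIONS`, the compound names, and the helper names are all hypothetical.

```python
# Hypothetical toy network (not from the paper). Each reaction maps a
# compound to its stoichiometric coefficient: negative means consumed,
# positive means produced.
REACTIONS = {
    "r1": {"A": -1, "B": 1},          # A -> B
    "r2": {"B": -2, "C": 1},          # 2 B -> C
    "r3": {"C": -1, "D": 1, "B": 1},  # C -> D + B
}


def net_production(fluxes):
    """Return the net amount of each compound produced by the given rates."""
    net = {}
    for reaction, rate in fluxes.items():
        for compound, coeff in REACTIONS[reaction].items():
            net[compound] = net.get(compound, 0) + coeff * rate
    return net


def is_factory(fluxes, sources, targets):
    """Check the assumed factory conditions for a rate assignment.

    Assumption: sources may be consumed freely, every other compound must
    not be depleted, and every target must be strictly produced.
    """
    if any(rate < 0 for rate in fluxes.values()):
        return False
    net = net_production(fluxes)
    for compound, amount in net.items():
        if compound not in sources and amount < 0:
            return False  # an intermediate or target is depleted
    return all(net.get(target, 0) > 0 for target in targets)


def factory_weight(fluxes, weights):
    """Sum the weights of the reactions that are actually used (rate > 0)."""
    return sum(weights[r] for r, rate in fluxes.items() if rate > 0)
```

In this toy network, using all three reactions at rate 1 conserves B and C while producing D from A, so the assignment passes the check. Dropping `r1` depletes B, so the remaining two reactions alone do not pass. Finding the minimum-weight assignment that passes is the NP-complete problem the abstract refers to; this sketch only verifies a candidate.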


Source journal

Journal of Computational Biology (Biology; Computer Science, Interdisciplinary Applications)

CiteScore: 3.60
Self-citation rate: 5.90%
Articles published: 113
Review time: 6-12 weeks

About the journal: Journal of Computational Biology is the leading peer-reviewed journal in computational biology and bioinformatics, publishing in-depth statistical, mathematical, and computational analysis of methods, as well as their practical impact. Available only online, this is an essential journal for scientists and students who want to keep abreast of developments in bioinformatics.

Journal of Computational Biology coverage includes:
- Genomics
- Mathematical modeling and simulation
- Distributed and parallel biological computing
- Designing biological databases
- Pattern matching and pattern detection
- Linking disparate databases and data
- New tools for computational biology
- Relational and object-oriented database technology for bioinformatics
- Biological expert system design and use
- Reasoning by analogy, hypothesis formation, and testing by machine
- Management of biological databases