MINN: A metabolic-informed neural network for integrating omics data into genome-scale metabolic modeling.

Impact factor 4.1 · Zone 2 (Biology) · JCR Q2 (Biochemistry & Molecular Biology)
Computational and Structural Biotechnology Journal · Publication date: 2025-08-07 · eCollection date: 2025-01-01 · DOI: 10.1016/j.csbj.2025.08.004
Gabriele Tazza, Francesco Moro, Dario Ruggeri, Bas Teusink, László Vidács
Volume 27, pages 3609-3617. Publication type: Journal Article. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12359237/pdf/

Abstract

Understanding cellular behavior requires integrating metabolism with its regulation. Multi-omics data provide a detailed snapshot of the molecular processes that underpin cellular functions and their regulation, describing the current state of the cell. Machine Learning (ML) models can uncover complex patterns and relationships within these data, but they require large training datasets and often lack interpretability. Mathematical models such as Genome-Scale Metabolic Models (GEMs), by contrast, offer a structured framework for analyzing the organization and dynamics of specific cellular mechanisms, but they do not readily integrate omics information. Recently, a framework for embedding GEMs in a neural network has been introduced: these hybrid models combine the strengths of mechanistic and data-driven approaches, offering a promising platform for integrating different data sources with mechanistic knowledge. In this study, we present a Metabolic-Informed Neural Network (MINN) that uses multi-omics data to predict metabolic fluxes in Escherichia coli under different growth rates and gene knockouts. We compare its performance with that of pure ML and parsimonious Flux Balance Analysis (pFBA) and show that it improves prediction performance. We also highlight how conflicts can emerge between the data-driven and the mechanistic objectives, and we propose different solutions to mitigate them. Finally, we illustrate a strategy for coupling the MINN with pFBA to enhance the interpretability of the solution.
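
The tension between the data-driven and the mechanistic objectives can be illustrated with a toy example. The sketch below is not the MINN implementation and does not use the E. coli GEM: the three-reaction network, the function names, and the `weight` hyperparameter are all assumptions made for illustration. It combines a data-fit term (mean squared error against measured fluxes) with a steady-state term (the squared residual of S v = 0), which is one common way such hybrid losses are formulated.

```python
# Toy sketch only: a hypothetical 2-metabolite, 3-reaction chain,
# not the genome-scale model used in the paper.
# Rows are metabolites, columns are reactions: uptake -> A, A -> B, B -> export.
S = [
    [1.0, -1.0, 0.0],   # metabolite A: produced by uptake, consumed by A -> B
    [0.0, 1.0, -1.0],   # metabolite B: produced by A -> B, consumed by export
]


def data_loss(v_pred, v_meas):
    """Mean squared error between predicted and measured fluxes."""
    return sum((p - m) ** 2 for p, m in zip(v_pred, v_meas)) / len(v_pred)


def mechanistic_loss(v_pred, stoich=S):
    """Mean squared steady-state residual, i.e. how far S v is from zero."""
    residuals = [sum(s * v for s, v in zip(row, v_pred)) for row in stoich]
    return sum(r ** 2 for r in residuals) / len(residuals)


def hybrid_loss(v_pred, v_meas, weight=1.0):
    """Weighted sum of both terms; `weight` is an assumed hyperparameter."""
    return data_loss(v_pred, v_meas) + weight * mechanistic_loss(v_pred)


# Hypothetical measurements that slightly violate mass balance (e.g. noise).
v_meas = [10.0, 9.0, 10.0]
fits_data = [10.0, 9.0, 10.0]   # reproduces the data, violates S v = 0
balanced = [10.0, 10.0, 10.0]   # satisfies S v = 0, misses one measurement

for name, v in [("fits_data", fits_data), ("balanced", balanced)]:
    print(name, data_loss(v, v_meas), mechanistic_loss(v), hybrid_loss(v, v_meas))
```

In this toy setting neither candidate drives both terms to zero, so the choice of `weight` decides which objective dominates. That is the kind of conflict the abstract refers to, although the mitigation strategies proposed in the paper are not reproduced here.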


Source journal: Computational and Structural Biotechnology Journal (Biochemistry, Genetics and Molecular Biology: Biophysics)
CiteScore: 9.30
Self-citation rate: 3.30%
Articles published: 540
Review time: 6 weeks
About the journal: Computational and Structural Biotechnology Journal (CSBJ) is an online gold open access journal publishing research articles and reviews after full peer review. All articles are published, without barriers to access, immediately upon acceptance. The journal places a strong emphasis on functional and mechanistic understanding of how molecular components in a biological process work together through the application of computational methods. Structural data may provide such insights, but they are not a prerequisite for publication in the journal. Specific areas of interest include, but are not limited to: structure and function of proteins, nucleic acids and other macromolecules; structure and function of multi-component complexes; protein folding, processing and degradation; enzymology; computational and structural studies of plant systems; microbial informatics; genomics; proteomics; metabolomics; algorithms and hypothesis in bioinformatics; mathematical and theoretical biology; computational chemistry and drug discovery; microscopy and molecular imaging; nanotechnology; systems and synthetic biology.