Lun Ai, Stephen H. Muggleton, Shi-Shun Liang, Geoff S. Baldwin
arXiv - QuanBio - Molecular Networks, published 2024-05-10. DOI: arxiv-2405.06724 (https://doi.org/arxiv-2405.06724)
Boolean matrix logic programming for active learning of gene functions in genome-scale metabolic network models
Techniques to autonomously drive research have been prominent in
Computational Scientific Discovery, while Synthetic Biology is a field of
science that focuses on designing and constructing new biological systems for
useful purposes. Here we seek to apply logic-based machine learning techniques
to facilitate cellular engineering and drive biological discovery.
Comprehensive databases of metabolic processes called genome-scale metabolic
network models (GEMs) are often used to evaluate cellular engineering
strategies to optimise target compound production. However, predicted host
behaviours are not always correctly described by GEMs, often due to errors in
the models. The task of learning the intricate genetic interactions within GEMs
presents computational and empirical challenges. To address these, we describe
a novel approach called Boolean Matrix Logic Programming (BMLP), which uses
Boolean matrices to evaluate large logic programs. We introduce a new system,
$BMLP_{active}$, which efficiently explores the genomic hypothesis space by
guiding informative experimentation through active learning. In contrast to
sub-symbolic methods, $BMLP_{active}$ encodes a state-of-the-art GEM of a
widely accepted bacterial host in an interpretable and logical representation
using Datalog logic programs. Notably, $BMLP_{active}$ can successfully learn
the interaction between a gene pair with fewer training examples than random
experimentation, overcoming the increase in experimental design space.
$BMLP_{active}$ enables rapid optimisation of metabolic models to reliably
engineer biological systems for producing useful compounds. It offers a
realistic approach to creating a self-driving lab for microbial engineering.
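
To make the idea of evaluating a logic program with Boolean matrices concrete, here is a minimal sketch. It is not the authors' implementation: the toy `path`/`edge` program, the node names, and every function name are assumptions chosen for illustration. Binary facts are stored as a Boolean matrix, and the recursive rule is evaluated by repeated Boolean matrix multiplication until a fixpoint is reached.

```python
# Toy Datalog program (hypothetical, not taken from the paper's GEM encoding):
#   path(X, Y) :- edge(X, Y).
#   path(X, Y) :- edge(X, Z), path(Z, Y).
NODES = ["a", "b", "c", "d"]
FACTS = [("a", "b"), ("b", "c"), ("c", "d")]


def facts_to_matrix(nodes, facts):
    """Encode binary facts edge(src, dst) as a square Boolean matrix."""
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[False] * len(nodes) for _ in nodes]
    for src, dst in facts:
        matrix[index[src]][index[dst]] = True
    return matrix


def bool_matmul(a, b):
    """Boolean product: entry (i, j) is True if some k has a[i][k] and b[k][j]."""
    size = len(a)
    return [
        [any(a[i][k] and b[k][j] for k in range(size)) for j in range(size)]
        for i in range(size)
    ]


def bool_or(a, b):
    """Element-wise disjunction of two Boolean matrices."""
    return [[x or y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def evaluate_path(edge):
    """Least fixpoint of path = edge OR (edge * path)."""
    path = edge
    while True:
        updated = bool_or(edge, bool_matmul(edge, path))
        if updated == path:
            return path
        path = updated


edge = facts_to_matrix(NODES, FACTS)
path = evaluate_path(edge)
derived = [
    (NODES[i], NODES[j])
    for i in range(len(NODES))
    for j in range(len(NODES))
    if path[i][j]
]
```

Here `derived` holds every `path(X, Y)` fact entailed by the toy program. In a metabolic setting one could imagine the matrix indices standing for metabolites or reactions, but that mapping is an assumption and is not described in the abstract.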
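
The claim that active learning needs fewer experiments than random selection can be illustrated with a small sketch. The abstract does not state the selection criterion used by $BMLP_{active}$, so the code below swaps in a simple halving heuristic as an assumption: pick the experiment whose predicted outcomes split the remaining hypotheses most evenly. The hypothesis labels, the experiment encoding, and all function names are hypothetical.

```python
def pick_experiment(hypotheses, experiments, predict):
    """Return the experiment whose predicted outcomes split the hypotheses most evenly."""
    def imbalance(experiment):
        positives = sum(predict(h, experiment) for h in hypotheses)
        return abs(2 * positives - len(hypotheses))
    return min(experiments, key=imbalance)


def active_learn(hypotheses, experiments, predict, oracle):
    """Run experiments until one hypothesis remains or none are left to try."""
    remaining = list(hypotheses)
    untried = list(experiments)
    asked = []
    while len(remaining) > 1 and untried:
        experiment = pick_experiment(remaining, untried, predict)
        untried.remove(experiment)
        outcome = oracle(experiment)  # stands in for a wet-lab result
        remaining = [h for h in remaining if predict(h, experiment) == outcome]
        asked.append(experiment)
    return remaining, asked


# Eight hypothetical candidate hypotheses, labelled 0..7.
HYPOTHESES = list(range(8))

# Each experiment is modelled as the set of hypotheses that predict a
# positive outcome (for example, growth) under that experiment.
EXPERIMENTS = [frozenset({h}) for h in HYPOTHESES] + [
    frozenset({4, 5, 6, 7}),
    frozenset({2, 3, 6, 7}),
    frozenset({1, 3, 5, 7}),
]

TRUE_HYPOTHESIS = 5  # unknown to the learner; used only by the oracle


def predict(hypothesis, experiment):
    return hypothesis in experiment


def oracle(experiment):
    return predict(TRUE_HYPOTHESIS, experiment)


remaining, asked = active_learn(HYPOTHESES, EXPERIMENTS, predict, oracle)
```

With this toy setup the learner isolates the true hypothesis after three experiments, whereas testing the single-hypothesis experiments in listed order would take six. The numbers only illustrate the halving idea and say nothing about the paper's reported results.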