Active learning of digenic functions with boolean matrix logic programming
Lun Ai, Stephen H. Muggleton, Shi-shun Liang, Geoff S. Baldwin
arXiv - QuanBio - Molecular Networks, published 2024-08-19. https://doi.org/arxiv-2408.14487
Abstract
We apply logic-based machine learning techniques to facilitate cellular
engineering and drive biological discovery, based on comprehensive databases of
metabolic processes called genome-scale metabolic network models (GEMs).
GEMs do not always predict host behaviours correctly. Learning
the intricate genetic interactions within GEMs presents computational and
empirical challenges. To address these, we describe a novel approach called
Boolean Matrix Logic Programming (BMLP), which uses Boolean matrices to
evaluate large logic programs. We introduce a new system, $BMLP_{active}$,
which efficiently explores the genomic hypothesis space by guiding informative
experimentation through active learning. In contrast to sub-symbolic methods,
$BMLP_{active}$ encodes a state-of-the-art GEM of a widely accepted bacterial
host in an interpretable and logical representation using Datalog logic
programs. Notably, $BMLP_{active}$ can successfully learn the interaction
between a gene pair with fewer training examples than random experimentation,
overcoming the increase in experimental design space. $BMLP_{active}$ enables
rapid optimisation of metabolic models and offers a realistic approach to a
self-driving lab for microbial engineering.
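
The general idea of evaluating a recursive Datalog program with Boolean matrices can be illustrated with a small sketch. This is not the authors' implementation: the four-metabolite network, the predicate names `edge` and `path`, and the helper functions are all hypothetical and chosen only for illustration. The sketch encodes the facts of a binary relation as a Boolean adjacency matrix and computes the recursive relation as a least fixpoint of Boolean matrix products.

```python
# Illustrative sketch only; the network and all names below are assumptions,
# not taken from the paper or its GEM.
#
# Datalog program being evaluated:
#   path(X, Y) :- edge(X, Y).
#   path(X, Y) :- edge(X, Z), path(Z, Y).


def bool_matmul(a, b):
    """Boolean product: result[i][j] is True if a[i][k] and b[k][j] for some k."""
    inner, cols = len(b), len(b[0])
    return [
        [any(row[k] and b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


def bool_or(a, b):
    """Element-wise OR of two Boolean matrices of the same shape."""
    return [[x or y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def transitive_closure(edge):
    """Least fixpoint of path = edge OR (edge x path)."""
    path = edge
    while True:
        updated = bool_or(edge, bool_matmul(edge, path))
        if updated == path:  # no new facts derived, so the fixpoint is reached
            return path
        path = updated


# Hypothetical toy pathway: each fact edge(s, t) means "s is converted to t".
names = ["glucose", "g6p", "f6p", "pyruvate"]
facts = [("glucose", "g6p"), ("g6p", "f6p"), ("f6p", "pyruvate")]

index = {name: i for i, name in enumerate(names)}
edge = [[False] * len(names) for _ in names]
for source, target in facts:
    edge[index[source]][index[target]] = True

path = transitive_closure(edge)
reachable = {
    (names[i], names[j])
    for i in range(len(names))
    for j in range(len(names))
    if path[i][j]
}
print(sorted(reachable))
```

Each iteration derives every new `path` fact in one matrix product rather than resolving one clause at a time, which is the kind of bulk evaluation that makes a matrix encoding attractive for large programs. A practical system would likely use packed bit vectors or a numerical library rather than nested Python lists.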