Innovations in integrating machine learning and agent-based modeling of biomedical systems

Frontiers in Systems Biology | Pub Date: 2022-06-02 | DOI: 10.3389/fsysb.2022.959665

N. Sivakumar, C. Mura, S. Peirce


Citations: 4

Abstract

Agent-based modeling (ABM) is a well-established computational paradigm for simulating complex systems in terms of the interactions between individual entities that comprise the system’s population. Machine learning (ML) refers to computational approaches whereby algorithms use statistical methods to “learn” from data on their own, i.e., without imposing any a priori model/theory onto a system or its behavior. Biological systems—ranging from molecules, to cells, to entire organisms, to whole populations and even ecosystems—consist of vast numbers of discrete entities, governed by complex webs of interactions that span various spatiotemporal scales and exhibit nonlinearity, stochasticity, and variable degrees of coupling between entities. For these reasons, the macroscopic properties and collective dynamics of biological systems are generally difficult to accurately model or predict via continuum modeling techniques and mean-field formalisms. ABM takes a “bottom-up” approach that obviates common difficulties of other modeling approaches by enabling one to relatively easily create (or at least propose, for testing) a set of well-defined “rules” to be applied to the individual entities (agents) in a system. Quantitatively evaluating a system and propagating its state over a series of discrete time-steps effectively simulates the system, allowing various observables to be computed and the system’s properties to be analyzed. Because the rules that govern an ABM can be difficult to abstract and formulate from experimental data, at least in an unbiased way, there is a uniquely synergistic opportunity to employ ML to help infer optimal, system-specific ABM rules. Once such rule-sets are devised, running ABM calculations can generate a wealth of data, and ML can be applied in that context too—for example, to generate statistical measures that accurately and meaningfully describe the stochastic outputs of a system and its properties. 
As an example of synergy in the other direction (from ABM to ML), ABM simulations can generate plausible (realistic) datasets for training ML algorithms (e.g., for regularization, to mitigate overfitting). In these ways, one can envision a variety of synergistic ABM⇄ML loops. After introducing some basic ideas about ABMs and ML, and their limitations, this Review describes examples of how ABM and ML have been integrated in diverse contexts, spanning spatial scales that include multicellular and tissue-scale biology to human population-level epidemiology. In so doing, we have used published studies as a guide to identify ML approaches that are well-suited to particular types of ABM applications, based on the scale of the biological system and the properties of the available data.
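To make the "rules applied to individual agents over discrete time-steps" idea concrete, here is a minimal sketch in Python. It is not taken from the Review: the `Cell` agent, the two rules (death and division), and the parameter names `p_divide` and `p_die` are all hypothetical choices made only for illustration. The sketch propagates a small population forward in time, records one observable (population size), and then summarizes the stochastic output across replicate runs, which is the kind of data an ML method could later be trained on or used to describe.

```python
import random
import statistics
from dataclasses import dataclass


@dataclass
class Cell:
    """One agent; `age` counts the time-steps it has survived (illustrative state)."""
    age: int = 0


def step(cells, p_divide, p_die, rng):
    """Apply the per-agent rules once and return the next population."""
    next_cells = []
    for cell in cells:
        if rng.random() < p_die:
            continue  # hypothetical rule 1: the cell dies and is removed
        cell.age += 1
        next_cells.append(cell)
        if rng.random() < p_divide:
            next_cells.append(Cell())  # hypothetical rule 2: the cell divides
    return next_cells


def simulate(n_cells=20, n_steps=30, p_divide=0.10, p_die=0.05, seed=0):
    """Propagate the system over discrete time-steps; return population size per step."""
    rng = random.Random(seed)  # seeded so each replicate is reproducible
    cells = [Cell() for _ in range(n_cells)]
    sizes = [len(cells)]
    for _ in range(n_steps):
        cells = step(cells, p_divide, p_die, rng)
        sizes.append(len(cells))
    return sizes


def summarize(n_replicates=50, **kwargs):
    """Run replicates with different seeds; return mean and sample standard
    deviation of the final population size."""
    finals = [simulate(seed=s, **kwargs)[-1] for s in range(n_replicates)]
    return statistics.mean(finals), statistics.stdev(finals)


if __name__ == "__main__":
    mean_final, sd_final = summarize()
    print(f"final population: mean={mean_final:.1f}, sd={sd_final:.1f}")
```

Because every run is stochastic, a single trajectory says little; the replicate loop in `summarize` is the simplest way to turn many runs into a statistical description. In an ABM⇄ML workflow of the kind the abstract envisions, the rule parameters (here `p_divide` and `p_die`) would be the quantities one might try to infer from experimental data, and the simulated trajectories could serve as synthetic training data.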
