{"title":"Generalization power of threshold Boolean networks","authors":"Gonzalo A. Ruz , Anthony D. Cho","doi":"10.1016/j.biosystems.2025.105572","DOIUrl":null,"url":null,"abstract":"<div><div>Threshold Boolean networks are widely used to model gene regulatory systems and social dynamics such as consensus formation. In these networks, each node takes a binary value (0 or 1), leading to an exponential growth in the number of possible configurations with the number of nodes (<span><math><msup><mrow><mn>2</mn></mrow><mrow><mi>n</mi></mrow></msup></math></span>). Inferring such networks involves learning a weight matrix and threshold vector from configuration data. However, in practice, the full state transition matrix is rarely available. This study investigates the generalization power of threshold Boolean networks, specifically, their ability to accurately infer the underlying network as the amount of available training data is reduced or degraded. We conducted experiments to empirically explore this generalization across networks with varying sizes and connectivities, using the perceptron learning algorithm for training. We also examined scenarios where data is degraded and evaluated the networks’ ability to preserve the original system’s fixed points. Our results reveal an inverse relationship between network size and the required portion of the state transition matrix: larger networks require less data to infer the original structure. For example, networks with five nodes required about 62.5% of the data, whereas networks with nine nodes needed only 46%. Conversely, we observed a positive correlation between node indegree and the amount of training data necessary for accurate inference. In terms of preserving fixed points, our findings indicate that using approximately 40% of the data is generally sufficient to retain the fixed points present in the complete dataset.</div></div>","PeriodicalId":50730,"journal":{"name":"Biosystems","volume":"257 ","pages":"Article 105572"},"PeriodicalIF":1.9000,"publicationDate":"2025-08-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Biosystems","FirstCategoryId":"99","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0303264725001820","RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"BIOLOGY","Score":null,"Total":0}
Citations: 0
Abstract
Threshold Boolean networks are widely used to model gene regulatory systems and social dynamics such as consensus formation. In these networks, each node takes a binary value (0 or 1), leading to an exponential growth in the number of possible configurations with the number of nodes (2^n). Inferring such networks involves learning a weight matrix and threshold vector from configuration data. However, in practice, the full state transition matrix is rarely available. This study investigates the generalization power of threshold Boolean networks, specifically, their ability to accurately infer the underlying network as the amount of available training data is reduced or degraded. We conducted experiments to empirically explore this generalization across networks with varying sizes and connectivities, using the perceptron learning algorithm for training. We also examined scenarios where data is degraded and evaluated the networks' ability to preserve the original system's fixed points. Our results reveal an inverse relationship between network size and the required portion of the state transition matrix: larger networks require less data to infer the original structure. For example, networks with five nodes required about 62.5% of the data, whereas networks with nine nodes needed only 46%. Conversely, we observed a positive correlation between node indegree and the amount of training data necessary for accurate inference. In terms of preserving fixed points, our findings indicate that using approximately 40% of the data is generally sufficient to retain the fixed points present in the complete dataset.
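To make the setup concrete, below is a minimal sketch of the kind of experiment the abstract describes, assuming the standard threshold update rule x_i(t+1) = 1 if sum_j W[i,j] x_j(t) >= theta_i, else 0, and the classic perceptron rule applied independently per node. The function names (step, perceptron_fit), the toy 3-node network, and the 62.5% training split are illustrative assumptions, not the authors' actual code or parameters.

```python
import numpy as np

def step(x, W, theta):
    """One synchronous update of a threshold Boolean network:
    x_i(t+1) = 1 if sum_j W[i, j] * x_j(t) >= theta[i], else 0."""
    return (W @ x >= theta).astype(int)

def perceptron_fit(transitions, n, lr=1.0, epochs=200):
    """Learn (W, theta) from (state, next_state) pairs using the
    perceptron rule, treating each node as an independent perceptron."""
    W = np.zeros((n, n))
    theta = np.zeros(n)
    for _ in range(epochs):
        errors = 0
        for x, y in transitions:
            pred = (W @ x >= theta).astype(int)
            for i in range(n):
                err = y[i] - pred[i]        # +1, 0, or -1
                if err != 0:
                    W[i] += lr * err * x    # perceptron weight update
                    theta[i] -= lr * err    # threshold moves opposite way
                    errors += 1
        if errors == 0:                     # all transitions reproduced
            break
    return W, theta

# Hypothetical toy example: a 3-node network with 2^3 = 8 states.
rng = np.random.default_rng(0)
n = 3
W_true = rng.choice([-1, 0, 1], size=(n, n))
theta_true = np.zeros(n)
states = [np.array([(s >> k) & 1 for k in range(n)]) for s in range(2 ** n)]
data = [(x, step(x, W_true, theta_true)) for x in states]

# Train on only a fraction of the state transition matrix (~62.5% here,
# echoing the figure reported for five-node networks).
train = data[: int(0.625 * len(data))]
W, theta = perceptron_fit(train, n)

# Check which fixed points (states with step(x) == x) are preserved.
fp_true = [x for x in states if np.array_equal(step(x, W_true, theta_true), x)]
fp_learn = [x for x in states if np.array_equal(step(x, W, theta), x)]
print("true fixed points:   ", [x.tolist() for x in fp_true])
print("learned fixed points:", [x.tolist() for x in fp_learn])
```

Comparing fp_true against fp_learn as the training fraction shrinks is one way to reproduce the paper's fixed-point preservation measurements; the study's actual protocol over varying network sizes and connectivities is described in the full article.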
Journal introduction:
BioSystems encourages experimental, computational, and theoretical articles that link biology, evolutionary thinking, and the information processing sciences. The link areas form a circle that encompasses the fundamental nature of biological information processing, computational modeling of complex biological systems, evolutionary models of computation, the application of biological principles to the design of novel computing systems, and the use of biomolecular materials to synthesize artificial systems that capture essential principles of natural biological information processing.