{"title":"Generalization power of threshold Boolean networks","authors":"Gonzalo A. Ruz , Anthony D. Cho","doi":"10.1016/j.biosystems.2025.105572","DOIUrl":null,"url":null,"abstract":"<div><div>Threshold Boolean networks are widely used to model gene regulatory systems and social dynamics such as consensus formation. In these networks, each node takes a binary value (0 or 1), leading to an exponential growth in the number of possible configurations with the number of nodes (<span><math><msup><mrow><mn>2</mn></mrow><mrow><mi>n</mi></mrow></msup></math></span>). Inferring such networks involves learning a weight matrix and threshold vector from configuration data. However, in practice, the full state transition matrix is rarely available. This study investigates the generalization power of threshold Boolean networks, specifically, their ability to accurately infer the underlying network as the amount of available training data is reduced or degraded. We conducted experiments to empirically explore this generalization across networks with varying sizes and connectivities, using the perceptron learning algorithm for training. We also examined scenarios where data is degraded and evaluated the networks’ ability to preserve the original system’s fixed points. Our results reveal an inverse relationship between network size and the required portion of the state transition matrix: larger networks require less data to infer the original structure. For example, networks with five nodes required about 62.5% of the data, whereas networks with nine nodes needed only 46%. Conversely, we observed a positive correlation between node indegree and the amount of training data necessary for accurate inference. In terms of preserving fixed points, our findings indicate that using approximately 40% of the data is generally sufficient to retain the fixed points present in the complete dataset.</div></div>","PeriodicalId":50730,"journal":{"name":"Biosystems","volume":"257 ","pages":"Article 105572"},"PeriodicalIF":1.9000,"publicationDate":"2025-08-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Biosystems","FirstCategoryId":"99","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0303264725001820","RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"BIOLOGY","Score":null,"Total":0}
Citations: 0
Abstract
Threshold Boolean networks are widely used to model gene regulatory systems and social dynamics such as consensus formation. In these networks, each node takes a binary value (0 or 1), leading to an exponential growth in the number of possible configurations with the number of nodes (2^n). Inferring such networks involves learning a weight matrix and threshold vector from configuration data. However, in practice, the full state transition matrix is rarely available. This study investigates the generalization power of threshold Boolean networks, specifically, their ability to accurately infer the underlying network as the amount of available training data is reduced or degraded. We conducted experiments to empirically explore this generalization across networks with varying sizes and connectivities, using the perceptron learning algorithm for training. We also examined scenarios where data is degraded and evaluated the networks' ability to preserve the original system's fixed points. Our results reveal an inverse relationship between network size and the required portion of the state transition matrix: larger networks require less data to infer the original structure. For example, networks with five nodes required about 62.5% of the data, whereas networks with nine nodes needed only 46%. Conversely, we observed a positive correlation between node indegree and the amount of training data necessary for accurate inference. In terms of preserving fixed points, our findings indicate that using approximately 40% of the data is generally sufficient to retain the fixed points present in the complete dataset.
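To make the setup concrete, below is a minimal sketch of the kind of experiment the abstract describes, assuming the standard threshold update rule x_i(t+1) = 1 if sum_j W[i,j] x_j(t) >= theta_i, else 0, and the classic perceptron rule applied independently per node. The function names (step, perceptron_fit), the toy 3-node network, and the 62.5% training split are illustrative assumptions, not the authors' actual code or parameters.

```python
import numpy as np

def step(x, W, theta):
    """One synchronous update of a threshold Boolean network:
    x_i(t+1) = 1 if sum_j W[i, j] * x_j(t) >= theta[i], else 0."""
    return (W @ x >= theta).astype(int)

def perceptron_fit(transitions, n, lr=1.0, epochs=200):
    """Learn (W, theta) from (state, next_state) pairs using the
    perceptron rule, treating each node as an independent perceptron."""
    W = np.zeros((n, n))
    theta = np.zeros(n)
    for _ in range(epochs):
        errors = 0
        for x, y in transitions:
            pred = (W @ x >= theta).astype(int)
            for i in range(n):
                err = y[i] - pred[i]        # +1, 0, or -1
                if err != 0:
                    W[i] += lr * err * x    # perceptron weight update
                    theta[i] -= lr * err    # threshold moves opposite way
                    errors += 1
        if errors == 0:                     # all transitions reproduced
            break
    return W, theta

# Hypothetical toy example: a 3-node network with 2^3 = 8 states.
rng = np.random.default_rng(0)
n = 3
W_true = rng.choice([-1, 0, 1], size=(n, n))
theta_true = np.zeros(n)
states = [np.array([(s >> k) & 1 for k in range(n)]) for s in range(2 ** n)]
data = [(x, step(x, W_true, theta_true)) for x in states]

# Train on only a fraction of the state transition matrix (~62.5% here,
# echoing the figure reported for five-node networks).
train = data[: int(0.625 * len(data))]
W, theta = perceptron_fit(train, n)

# Check which fixed points (states with step(x) == x) are preserved.
fp_true = [x for x in states if np.array_equal(step(x, W_true, theta_true), x)]
fp_learn = [x for x in states if np.array_equal(step(x, W, theta), x)]
print("true fixed points:   ", [x.tolist() for x in fp_true])
print("learned fixed points:", [x.tolist() for x in fp_learn])
```

Comparing fp_true against fp_learn as the training fraction shrinks is one way to reproduce the paper's fixed-point preservation measurements; the study's actual protocol over varying network sizes and connectivities is described in the full article.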
Journal introduction:
BioSystems encourages experimental, computational, and theoretical articles that link biology, evolutionary thinking, and the information processing sciences. The link areas form a circle that encompasses the fundamental nature of biological information processing, computational modeling of complex biological systems, evolutionary models of computation, the application of biological principles to the design of novel computing systems, and the use of biomolecular materials to synthesize artificial systems that capture essential principles of natural biological information processing.