Authors: Yu Zhou, Xiufen Zou
Journal: Inverse Problems
Publication date: 2024-07-09
DOI: https://doi.org/10.1088/1361-6420/ad60f1
Inferring dynamical models from time-series biological data using an interpretable machine learning method based on weighted expression trees
The growing volume of time-series data makes it possible to glimpse the hidden dynamics in various fields. However, developing a highly interpretable computational toolbox that unveils interaction dynamics from data remains a crucial challenge. Here, we propose a new computational approach called Automated Dynamical Model Inference based on Expression Trees (ADMIET), in which a machine learning algorithm, the numerical integration of ordinary differential equations, and interpretability derived from prior knowledge are embedded into a symbolic learning scheme, establishing a general framework for revealing the hidden dynamics in time-series data. ADMIET takes full advantage of both machine learning algorithms and expression trees. First, we translate prior knowledge into constraints on the structure of the expression tree, which reduces the search space and enhances interpretability. Second, we use the proposed adaptive penalty function to ensure the convergence of the gradient descent algorithm and the selection of symbols. Compared with gene expression programming, ADMIET fits functions with higher accuracy and broader applicability. Moreover, ADMIET fits parameters in nonlinear forms better than regression methods. Furthermore, we apply ADMIET to two typical biological systems and one real dataset, each with different prior knowledge, to infer the dynamical equations. The results indicate that ADMIET can not only discover the interaction relationships but also provide accurate estimates of the parameters in the equations. These results demonstrate ADMIET's superiority in revealing interpretable dynamics from time-series biological data.
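To make the idea of a weighted expression tree combined with numerical ODE integration more concrete, the following is a minimal illustrative sketch, not the authors' ADMIET implementation. The `Node` class, its `op` names, the `rk4_integrate` helper, and the logistic-growth example with r = 1 and K = 10 are all assumptions introduced here for illustration. The sketch covers only the representation and the forward integration; the adaptive penalty function and the gradient descent training mentioned in the abstract are not reproduced.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One node of a weighted expression tree (hypothetical layout, not ADMIET's)."""
    op: str                                              # "var", "add", or "mul"
    children: List["Node"] = field(default_factory=list)
    weights: List[float] = field(default_factory=list)   # one per child, used by "add"

    def evaluate(self, x: float) -> float:
        if self.op == "var":
            return x
        values = [child.evaluate(x) for child in self.children]
        if self.op == "add":
            # Weighted sum: a weight of zero removes that subtree from the model.
            return sum(w * v for w, v in zip(self.weights, values))
        if self.op == "mul":
            result = 1.0
            for v in values:
                result *= v
            return result
        raise ValueError(f"unknown op: {self.op}")


def rk4_integrate(rhs: Node, x0: float, dt: float, steps: int) -> List[float]:
    """Integrate dx/dt = rhs(x) with the classical fourth-order Runge-Kutta scheme."""
    xs = [x0]
    x = x0
    for _ in range(steps):
        k1 = rhs.evaluate(x)
        k2 = rhs.evaluate(x + 0.5 * dt * k1)
        k3 = rhs.evaluate(x + 0.5 * dt * k2)
        k4 = rhs.evaluate(x + dt * k3)
        x += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
        xs.append(x)
    return xs


# Logistic growth dx/dt = r*x - (r/K)*x^2, with illustrative values r = 1, K = 10.
x = Node("var")
logistic = Node("add", children=[x, Node("mul", children=[x, x])], weights=[1.0, -0.1])

trajectory = rk4_integrate(logistic, x0=1.0, dt=0.01, steps=2000)
print(logistic.evaluate(2.0))  # right-hand side evaluated at x = 2
print(trajectory[-1])          # state at t = 20, near the carrying capacity K
```

In a representation like this, prior knowledge could be expressed by fixing which operators and children are allowed at each node, and symbol selection would correspond to driving some of the `weights` to zero during training.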