Causal Ecological Model of the Northern Adriatic Sea Based on Data from the EU Project “LTER Northern Adriatic Sea”

Pub Date: 2022-11-17  DOI: 10.15255/kui.2022.033
Želimir Kurtanjek
Citations: 0

Abstract

Kauzalni ekološki model sjevernog Jadrana temeljem podataka EU projekta “LTER Northern Adriatic Sea”
This work demonstrates how applied artificial intelligence methods and structural causal modelling (Structural Causal Model, SCM) can make a scientific contribution to determining functional causal dependencies in complex ecological systems. SCM was applied to determine how chlorophyll concentration depends on physical and chemical parameters in the northern Adriatic Sea during the period 1965 to 2015. The experimental data come from a long-term, extensive investigation conducted as part of the EU project “LTER Northern Adriatic Sea” and are freely available under the EU Open Science policy. The data form a “Big Data” base with 108 687 samples and 43 descriptors. The proposed mathematical model is a Bayes network (BN) represented as a directed acyclic graph (DAG). The model structure was determined by the Hamilton-Schmidt conditional independence test at a significance level of α = 0.05. The SCM shows that the direct causal variables for chlorophyll concentration are temperature, salinity, pH, and the concentrations of nitrogen, phosphorus, and silica. The BN model was adjusted according to d-separation in order to block confounding and contra-causal back-door interference. The functions of causal dependencies were determined as marginal distributions, using Bayes network models with a single interior layer for interpolation. The most important causal effect was that of temperature (−0.07 μg chlorophyll A/°C). The model predicted a reverse positive causality between chlorophyll concentration and dissolved oxygen (0.2 mg DO₂/μg chlorophyll A). A nonparametric comparative analysis of chlorophyll and abiotic parameters between the Croatian and the northern (Slovenian and Italian) Adriatic Sea was also carried out. The comparison was based on medians to avoid the pronounced influence of outliers caused by hydrodynamic effects.
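The back-door blocking mentioned above can be illustrated with a minimal sketch on synthetic data. Everything in it is an assumption made for illustration: the binary confounder `z` (imagined here as a season indicator), the linear data-generating model, and the function names are hypothetical and are not taken from the paper or from the LTER data. Only the value −0.07 is borrowed from the abstract, and it serves merely as the simulated ground truth. The sketch compares a naive regression slope with a back-door-adjusted slope, computed as the P(z)-weighted average of within-stratum slopes.

```python
import random
import statistics


def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def simulate(n=20000, true_effect=-0.07, seed=0):
    """Synthetic rows (z, x, y): z confounds temperature x and chlorophyll y.

    Hypothetical model: z shifts both the temperature and the chlorophyll
    level, which opens a back-door path x <- z -> y.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.random() < 0.5                    # hypothetical binary confounder
        x = rng.gauss(20.0 if z else 10.0, 2.0)   # temperature depends on z
        y = 2.0 + true_effect * x + (1.0 if z else 0.0) + rng.gauss(0.0, 0.1)
        rows.append((z, x, y))
    return rows


def naive_effect(rows):
    """Slope of y on x with the back-door path left open (biased)."""
    return slope([x for _, x, _ in rows], [y for _, _, y in rows])


def backdoor_effect(rows):
    """Back-door adjustment: sum over z of P(z) * slope(y ~ x | z)."""
    total = len(rows)
    effect = 0.0
    for z in (False, True):
        stratum = [(x, y) for zz, x, y in rows if zz == z]
        xs, ys = zip(*stratum)
        effect += len(stratum) / total * slope(xs, ys)
    return effect


rows = simulate()
print(f"naive slope:    {naive_effect(rows):+.3f}")
print(f"adjusted slope: {backdoor_effect(rows):+.3f}")
```

In this toy setting the naive slope even has the wrong sign, while conditioning on `z` recovers a value close to the simulated effect. With a real DAG, the adjustment set would have to be read off the graph via d-separation rather than assumed.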
The median concentration of dissolved oxygen was 5.8 mg O₂/l in the Croatian Adriatic versus 5.5 mg O₂/l in the Slovenian and Italian Adriatic, and the median temperature was T = 14.6 °C versus T = 15.1 °C. The abundance of dinoflagellates differs significantly: 3 cells/l in Croatia versus 5 cells/l in Slovenia and Italy. The difference is even more pronounced in the number and magnitude of “hot spot” outliers. The difference between chlorophyll concentrations is not significant (0.65 and 0.90 μg l⁻¹); however, the distribution of outliers differs significantly, with more frequent and larger outliers in the Italian and Slovenian Adriatic. A significant difference was also observed in the SiO₄ distribution, with higher concentrations in the western Adriatic. Random forest (RF) decision-tree models were applied to develop predictive models of biological parameters from abiotic data. The RF models were validated by 5-fold cross-validation. The models have out-of-bag mean relative errors of 6.5 % for chlorophyll, 17.4 % for photopigment, 18.8 % for diatoms, 17.4 % for dinoflagellates, and 12.1 % for coccolithophores. For each predictive model, the five most important predictors, which together account for 95 % of the importance, were determined.
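The validation scheme described above, 5-fold cross-validation scored by mean relative error, can be sketched as follows. This is not the paper's pipeline: to stay dependency-free, a one-variable least-squares line stands in for the random forest, and the data are synthetic. The names `k_fold_indices`, `fit_line`, and `cv_mean_relative_error`, and the toy temperature-chlorophyll relation, are assumptions made for illustration.

```python
import random
import statistics


def k_fold_indices(n, k=5, seed=0):
    """Shuffle the indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Least-squares intercept and slope (stand-in for a random forest)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def cv_mean_relative_error(xs, ys, k=5, seed=0):
    """Mean of |prediction - y| / |y| over all held-out samples."""
    errors = []
    for held_out in k_fold_indices(len(xs), k, seed):
        test = set(held_out)
        train = [i for i in range(len(xs)) if i not in test]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        errors += [abs(a + b * xs[i] - ys[i]) / abs(ys[i]) for i in held_out]
    return statistics.fmean(errors)


# Synthetic example data (hypothetical, not LTER measurements).
rng = random.Random(1)
temperature = [rng.uniform(8.0, 26.0) for _ in range(500)]
chlorophyll = [2.0 - 0.05 * t + rng.gauss(0.0, 0.05) for t in temperature]

mre = cv_mean_relative_error(temperature, chlorophyll)
print(f"5-fold CV mean relative error: {100 * mre:.1f} %")
```

Each sample is predicted exactly once by a model that never saw it during fitting, so the reported error estimates out-of-sample performance. Swapping `fit_line` for a random forest regressor would leave the fold logic unchanged.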