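Prediction of the potential geographic distribution of Oncomelania hupensis in Yunnan Province using random forest and maximum entropy models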

Q3 Medicine
Z Zhang, C Du, Y Zhang, H Wang, J Song, J Zhou, L Wang, J Sun, M Shen, C Chen, H Jiang, J Yan, X Feng, W Wang, P Qian, J Xue, S Li, Y Dong
DOI: 10.16250/j.32.1915.2024136 (https://doi.org/10.16250/j.32.1915.2024136)
Journal: 中国血吸虫病防治杂志 (Chinese Journal of Schistosomiasis Control), vol. 36, no. 6, pp. 562-571
Published: 2024-12-12
Publication type: Journal Article
Citations: 0

Abstract

[Prediction of potential geographic distribution of Oncomelania hupensis in Yunnan Province using random forest and maximum entropy models].

Objective: To predict the potential geographic distribution of Oncomelania hupensis in Yunnan Province using random forest (RF) and maximum entropy (MaxEnt) models, so as to provide insights into O. hupensis surveillance and control in Yunnan Province.

Methods: O. hupensis snail survey data collected in Yunnan Province from 2015 to 2016 were converted into snail distribution site data. Data on 22 environmental variables in Yunnan Province were collected: twelve climate variables (annual potential evapotranspiration, annual mean ground surface temperature, annual precipitation, annual mean air pressure, annual mean relative humidity, annual sunshine duration, annual mean air temperature, annual mean wind speed, ≥ 0 ℃ annual accumulated temperature, ≥ 10 ℃ annual accumulated temperature, aridity and moisture index), eight geographical variables (normalized difference vegetation index, landform type, land use type, altitude, soil type, soil texture-clay content, soil texture-sand content and soil texture-silt content) and two population and economic variables (gross domestic product and population). Variables were screened using the Pearson correlation test and the variance inflation factor (VIF) test. The RF, MaxEnt and ensemble models were built with the biomod2 package in R 4.2.1 and used to predict the potential distribution of O. hupensis snails in Yunnan Province after 2016. Model performance was evaluated through cross-validation and independent tests using the area under the receiver operating characteristic curve (AUC), the true skill statistic (TSS) and the Kappa statistic. To assess the importance of environmental variables, the variable contributions output by models with AUC values of > 0.950 and TSS values of > 0.850 were normalized to obtain an importance percentage for each variable.
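
The Pearson and VIF screening step described in the Methods can be sketched as follows. This is a minimal Python illustration, not the authors' code (the study used R 4.2.1 with biomod2). The function names `vif` and `screen`, the thresholds `r_max = 0.8` and `vif_max = 10.0`, and the greedy order in which variables are dropped are all assumptions, since the abstract does not report the cut-offs or the exact procedure.

```python
import numpy as np


def vif(X: np.ndarray) -> np.ndarray:
    """Variance inflation factor of each column of X (rows = sites).

    VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing column j
    on all other columns plus an intercept. Assumes no constant columns.
    """
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])  # intercept + other variables
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        ss_res = float(resid @ resid)
        ss_tot = float(((y - y.mean()) ** 2).sum())
        r2 = 1.0 - ss_res / ss_tot
        out[j] = np.inf if r2 >= 1.0 else 1.0 / (1.0 - r2)
    return out


def screen(X: np.ndarray, names: list, r_max: float = 0.8,
           vif_max: float = 10.0) -> list:
    """Two-stage screening; r_max and vif_max are hypothetical thresholds."""
    # Stage 1: Pearson screening. A variable is kept only if its absolute
    # correlation with every already-kept variable is at most r_max.
    corr = np.corrcoef(X, rowvar=False)
    selected = []
    for j in range(X.shape[1]):
        if all(abs(corr[j, k]) <= r_max for k in selected):
            selected.append(j)
    # Stage 2: repeatedly drop the variable with the largest VIF until
    # every remaining VIF is at most vif_max.
    while len(selected) > 1:
        v = vif(X[:, selected])
        worst = int(np.argmax(v))
        if v[worst] <= vif_max:
            break
        selected.pop(worst)
    return [names[j] for j in selected]
```

Calling `screen(X, names)` on a site-by-variable matrix returns the names of the retained variables. In practice, which member of a correlated pair to keep is usually decided on ecological grounds rather than by column order, so the order-based rule here is only a placeholder.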

Results: Data from 148 O. hupensis snail distribution sites and 15 environmental variables were included in the training sets of the RF and MaxEnt models. Both models had high predictive performance, with mean AUC values of > 0.900 and mean TSS and Kappa values of > 0.800, and the two models differed significantly in AUC (t = 19.862, P < 0.05), TSS (t = 10.140, P < 0.05) and Kappa values (t = 10.237, P < 0.05). The AUC, TSS and Kappa values of the ensemble model were 0.996, 0.954 and 0.920, respectively. In independent data verification, the AUC, TSS and Kappa values of the RF model and the ensemble model were all 1, indicating that both retained high performance on unseen data, whereas the MaxEnt model performed poorly, with TSS and Kappa values of 0 for 24% (24/100) of the modeling results. The modeling results of 79 RF models, 38 MaxEnt models and their ensemble models with AUC values of > 0.950 and TSS values of > 0.850 were included in the evaluation of environmental variable importance. The importance of annual sunshine duration (SSD) was 32.989%, 37.847% and 46.315% in the RF model, the MaxEnt model and their ensemble model, respectively, and that of annual mean relative humidity (RHU) was 30.947%, 15.921% and 28.121%, respectively. Important environmental variables were concentrated in the RF modeling results, dispersed in the MaxEnt modeling results, and most concentrated in the ensemble modeling results. The potential distribution of O. hupensis snails in Yunnan Province after 2016 was predicted to be relatively concentrated by the RF model and relatively extensive by the MaxEnt model, and the distribution predicted by the ensemble model was mostly the joint distribution predicted by the RF and MaxEnt models.
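
To help read the TSS and Kappa values reported above, the sketch below computes both statistics from a 2 × 2 presence/absence confusion matrix using their standard definitions. It is an illustrative Python example, not the biomod2 implementation; the function name `tss_and_kappa` and the example counts are hypothetical. The second call shows one way in which both statistics can equal 0 (a model that predicts absence everywhere). The abstract does not state why the MaxEnt runs scored 0, so this is only a possible reading.

```python
def tss_and_kappa(tp: int, fp: int, fn: int, tn: int) -> tuple[float, float]:
    """Return (TSS, Cohen's Kappa) for a binary confusion matrix.

    tp/fn: presence sites predicted present/absent.
    fp/tn: absence sites predicted present/absent.
    Assumes both classes occur in the data and that chance agreement
    is below 1, so no denominator is zero.
    """
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    tss = sensitivity + specificity - 1.0

    p_observed = (tp + tn) / n
    # Agreement expected by chance from the row and column totals.
    p_chance = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (p_observed - p_chance) / (1.0 - p_chance)
    return tss, kappa


# Hypothetical counts: a perfect classifier on 10 presences and 10 absences.
print(tss_and_kappa(tp=10, fp=0, fn=0, tn=10))   # (1.0, 1.0)
# Hypothetical counts: every site predicted absent.
print(tss_and_kappa(tp=0, fp=0, fn=10, tn=10))   # (0.0, 0.0)
```

Both statistics range up to 1 for perfect agreement, and a value of 0 means the classification is no better than chance, which is why runs scoring 0 are treated as poor performance.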

Conclusions: Both the RF and MaxEnt models are effective for predicting the potential distribution of O. hupensis snails in Yunnan Province, which facilitates targeted O. hupensis snail control.

Source journal
中国血吸虫病防治杂志 (Chinese Journal of Schistosomiasis Control), Medicine-Medicine (all)
CiteScore: 1.30
Self-citation rate: 0.00%
Publication count: 7021
Journal introduction: Chinese Journal of Schistosomiasis Control (ISSN: 1005-6661, CN: 32-1374/R), founded in 1989, is a scientific and technical journal supervised by the Jiangsu Provincial Health Commission and sponsored by the Jiangsu Institute of Schistosomiasis Control. The journal follows a prevention- and control-oriented policy with a nationwide and grassroots focus, and aims to serve scientific research on the prevention and treatment of schistosomiasis and other parasitic diseases. It mainly publishes academic papers reflecting the latest achievements and developments in the prevention and treatment of schistosomiasis and other parasitic diseases, and in related scientific research and management. The main columns include Guest Contributions, Expert Commentaries, Expert Perspectives, Expert Forums, Treatises, Prevention and Control Studies, Experimental Studies, Clinical Studies, Prevention and Control Experiences, Prevention and Control Management, Reviews, Case Reports, and Information. The journal is a reference for professional and technical personnel working on schistosomiasis and parasitic disease prevention and control, for management staff, and for teachers and students of medical schools.
The journal is included in major domestic databases, such as the Chinese Core List (8th edition), the China Science Citation Database (Core Edition) and China Science and Technology Core Journals (Statistical Source Journals), and in international databases including MEDLINE/PubMed, Scopus, EBSCO, Chemical Abstract, Embase, Zoological Record, JSTChina, Ulrichsweb, Western Pacific Region Index Medicus and CABI.