Development, Evaluation, and Application of Machine Learning Models for Accurate Prediction of Root Uptake of Per- and Polyfluoroalkyl Substances

IF 11.3; Zone 1, Environmental Science and Ecology; JCR Q1, ENGINEERING, ENVIRONMENTAL
Lei Xiang, Jing Qiu, Qian-Qi Chen, Peng-Fei Yu, Bai-Lin Liu, Hai-Ming Zhao, Yan-Wen Li, Nai-Xian Feng, Quan-Ying Cai, Ce-Hui Mo*, and Qing X. Li
DOI: 10.1021/acs.est.2c09788
Journal: Environmental Science & Technology, vol. 57, no. 46, pp. 18317–18328
Publication date: 2023-05-15
Publication type: Journal Article
URL: https://pubs.acs.org/doi/10.1021/acs.est.2c09788
Citations: 3

Abstract

Machine learning (ML) models were developed to understand the root uptake of per- and polyfluoroalkyl substances (PFASs) under complex PFAS-crop-soil interactions. Three hundred root concentration factor (RCF) data points and 26 features describing PFAS structures, crop properties, soil properties, and cultivation conditions were used for model development. The optimal ML model, obtained by stratified sampling, Bayesian optimization, and 5-fold cross-validation, was explained using permutation feature importance, individual conditional expectation plots, and 3D interaction plots. The results showed that soil organic carbon content, pH, chemical logP, soil PFAS concentration, root protein content, and exposure time strongly affected the root uptake of PFASs, with relative importances of 0.43, 0.25, 0.10, 0.05, 0.05, and 0.05, respectively. Furthermore, these factors showed key threshold ranges that favor PFAS uptake. Based on extended connectivity fingerprints, carbon-chain length was identified as the critical molecular structure affecting root uptake of PFASs, with a relative importance of 0.12. A user-friendly model was established with symbolic regression for accurately predicting RCF values of PFASs (including branched PFAS isomers). The study provides a novel approach for gaining insight into PFAS uptake by crops under complex PFAS-crop-soil interactions, with the aim of ensuring food safety and human health.
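
The abstract names permutation feature importance and 5-fold cross-validation but does not give the model or code. The following is a minimal sketch of how those two steps fit together, not the authors' implementation. It makes several assumptions: a small k-nearest-neighbour regressor stands in for the unspecified ML model, the data are synthetic (three made-up features, not the paper's 26 features or its RCF measurements), plain shuffled folds replace stratified sampling, and all function names are hypothetical. Importances are normalized to sum to 1, which is one common way to report "relative importance".

```python
import random


def knn_predict(train_X, train_y, query, k=5):
    """Predict by averaging the targets of the k nearest training rows."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, query)), target)
        for row, target in zip(train_X, train_y)
    )
    nearest = dists[:k]
    return sum(target for _, target in nearest) / len(nearest)


def mse(train_X, train_y, test_X, test_y):
    """Mean squared error of the k-NN stand-in model on a held-out set."""
    errors = [(knn_predict(train_X, train_y, row) - target) ** 2
              for row, target in zip(test_X, test_y)]
    return sum(errors) / len(errors)


def permutation_importance_cv(X, y, n_folds=5, seed=0):
    """Average the error increase from shuffling each feature over k folds,
    then normalize so the importances sum to 1."""
    rng = random.Random(seed)
    indices = list(range(len(X)))
    rng.shuffle(indices)
    folds = [indices[i::n_folds] for i in range(n_folds)]
    n_features = len(X[0])
    increases = [0.0] * n_features
    for fold in folds:
        held_out = set(fold)
        train_X = [X[i] for i in indices if i not in held_out]
        train_y = [y[i] for i in indices if i not in held_out]
        test_X = [X[i] for i in fold]
        test_y = [y[i] for i in fold]
        baseline = mse(train_X, train_y, test_X, test_y)
        for j in range(n_features):
            # Shuffle one column of the held-out fold; leave the rest intact.
            column = [row[j] for row in test_X]
            rng.shuffle(column)
            permuted = [row[:j] + [value] + row[j + 1:]
                        for row, value in zip(test_X, column)]
            increases[j] += mse(train_X, train_y, permuted, test_y) - baseline
    # Negative increases are treated as zero importance before normalizing.
    clipped = [max(0.0, inc / n_folds) for inc in increases]
    total = sum(clipped)
    return [c / total for c in clipped] if total > 0 else clipped


def make_synthetic_data(n=300, seed=1):
    """Synthetic stand-in data: strong, weak, and irrelevant features."""
    rng = random.Random(seed)
    X = [[rng.random(), rng.random(), rng.random()] for _ in range(n)]
    y = [3.0 * row[0] + 1.0 * row[1] + rng.gauss(0.0, 0.05) for row in X]
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic_data()
    for name, value in zip(["strong", "weak", "irrelevant"],
                           permutation_importance_cv(X, y)):
        print(f"{name}: {value:.2f}")
```

Shuffling only the held-out fold keeps the fitted model fixed, so the measured error increase reflects how much the model relies on that feature rather than how it would retrain without it.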


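Source journal: Environmental Science & Technology (环境科学与技术)
Category: Environmental Sciences; Engineering, Environmental
CiteScore: 17.50
Self-citation rate: 9.60%
Articles published: 12359
Review time: 2.8 months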
Journal description: Environmental Science & Technology (ES&T) is published by the American Chemical Society, consistent with the article DOI prefix 10.1021/acs.est. The description shown on the source page, a magazine co-sponsored by the Hubei Provincial Environmental Protection Bureau and the Hubei Provincial Academy of Environmental Sciences that covers environmental protection and related technical advances, refers to a separate Chinese journal sharing the translated name 环境科学与技术.