Examining the effects of analytical replication on data quality in a non-targeted analysis experiment.

IF 3.8, Zone 2 (Chemistry), Q1 Biochemical Research Methods
Troy M Ferland, Heather D Whitehead, Timothy J Buckley, Alex Chao, Jeffrey M Minucci, E Tyler Carr, Greg Janesch, Safia Rizwan, Nathaniel Charest, Antony J Williams, James P McCord, Jon R Sobus
Journal: Analytical and Bioanalytical Chemistry
Publication date: 2025-06-14
Publication type: Journal Article
DOI: 10.1007/s00216-025-05940-x (https://doi.org/10.1007/s00216-025-05940-x)
Citations: 0

Abstract

Non-targeted analysis (NTA) methods are integral to environmental monitoring given their ability to expand measurable chemical space beyond that of traditional targeted methods. Such vast quantities of NTA data are generated that exhaustive manual review is generally unfeasible. Computational tools facilitate automated data processing, but cannot always distinguish real signals (i.e., originating from a chemical in a sample) from artifacts. Replicate analysis is recommended to aid data review, but as NTA studies become larger, the cost of analytical replication becomes untenable. A need therefore exists for examination of information penalties associated with reduced replication. To investigate this issue, using an existing NTA dataset, we performed over 70,000 simulations of variable replication designs and calculated false discovery rates (FDRs) and false negative rates (FNRs) for NTA features and occurrences. We used regression models to explore associations between replication percentage and FDR/FNR, and to test whether rates were affected by NTA feature attributes. Inverse relationships were generally observed between replication percentage and FDR/FNR, such that lower replication yielded higher information penalties. Significant increases in FDR/FNR were observed for suspected per- and polyfluoroalkyl substances (PFAS) compared to non-PFAS, highlighting the potential for differences in information penalties across feature groups. Specific quantitative information penalties are expected to be unique for each NTA study based on sample type and workflow. The methods presented here can support future pilot-scale investigations that will inform the required level of replication in full-scale studies.
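The abstract does not spell out how each replication design was simulated or scored, so the following is only a minimal sketch of the general idea, under assumptions that are not taken from the paper: every sample has three replicate injections, an occurrence counts as "real" when a feature is seen in at least two of them, and a non-replicated sample keeps one randomly chosen injection whose detections are all accepted. The names `reference_calls`, `simulate`, and the toy `samples` data are hypothetical.

```python
import random

def reference_calls(replicates, min_hits=2):
    """Occurrences accepted under full replication (assumed rule):
    features detected in at least `min_hits` replicate injections."""
    counts = {}
    for injection in replicates:
        for feature in injection:
            counts[feature] = counts.get(feature, 0) + 1
    return {f for f, n in counts.items() if n >= min_hits}

def simulate(samples, replication_fraction, rng, min_hits=2):
    """Return (FDR, FNR) for one randomly drawn reduced-replication design."""
    names = sorted(samples)
    n_replicated = round(replication_fraction * len(names))
    replicated = set(rng.sample(names, n_replicated))
    tp = fp = fn = 0
    for name in names:
        truth = reference_calls(samples[name], min_hits)
        if name in replicated:
            called = truth  # all injections kept, same calls as the reference
        else:
            called = set(rng.choice(samples[name]))  # single injection, no filter
        tp += len(called & truth)
        fp += len(called - truth)
        fn += len(truth - called)
    fdr = fp / (fp + tp) if (fp + tp) else 0.0  # false calls among all calls
    fnr = fn / (fn + tp) if (fn + tp) else 0.0  # missed calls among real ones
    return fdr, fnr

# Toy data (hypothetical): sample -> list of replicate injections,
# each injection being the set of feature IDs detected in it.
samples = {
    "S1": [{"A", "B"}, {"A", "B", "X"}, {"A"}],
    "S2": [{"A", "C"}, {"A", "C"}, {"C", "Y"}],
}

rng = random.Random(0)
for fraction in (0.0, 0.5, 1.0):
    runs = [simulate(samples, fraction, rng) for _ in range(1000)]
    mean_fdr = sum(r[0] for r in runs) / len(runs)
    mean_fnr = sum(r[1] for r in runs) / len(runs)
    print(f"replication={fraction:.0%}  FDR={mean_fdr:.3f}  FNR={mean_fnr:.3f}")
```

Under these assumptions, full replication reproduces the reference calls exactly, so both rates are zero, while lower replication fractions let single-injection artifacts and missed detections through. A real study would substitute its own acceptance rule and feature table.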

Source journal: Analytical and Bioanalytical Chemistry
CiteScore: 8.00
Self-citation rate: 4.70%
Articles published: 638
Review time: 2.1 months
Journal description: Analytical and Bioanalytical Chemistry’s mission is the rapid publication of excellent and high-impact research articles on fundamental and applied topics of analytical and bioanalytical measurement science. Its scope is broad, and ranges from novel measurement platforms and their characterization to multidisciplinary approaches that effectively address important scientific problems. The Editors encourage submissions presenting innovative analytical research in concept, instrumentation, methods, and/or applications, including: mass spectrometry, spectroscopy, and electroanalysis; advanced separations; analytical strategies in “-omics” and imaging, bioanalysis, and sampling; miniaturized devices, medical diagnostics, sensors; analytical characterization of nano- and biomaterials; chemometrics and advanced data analysis.