Examining the effects of analytical replication on data quality in a non-targeted analysis experiment
Troy M Ferland, Heather D Whitehead, Timothy J Buckley, Alex Chao, Jeffrey M Minucci, E Tyler Carr, Greg Janesch, Safia Rizwan, Nathaniel Charest, Antony J Williams, James P McCord, Jon R Sobus
Analytical and Bioanalytical Chemistry, published 2025-06-14. DOI: 10.1007/s00216-025-05940-x (https://doi.org/10.1007/s00216-025-05940-x)
Citation count: 0
Abstract
Non-targeted analysis (NTA) methods are integral to environmental monitoring given their ability to expand measurable chemical space beyond that of traditional targeted methods. Such vast quantities of NTA data are generated that exhaustive manual review is generally unfeasible. Computational tools facilitate automated data processing, but cannot always distinguish real signals (i.e., originating from a chemical in a sample) from artifacts. Replicate analysis is recommended to aid data review, but as NTA studies become larger, the cost of analytical replication becomes untenable. A need therefore exists for examination of information penalties associated with reduced replication. To investigate this issue, using an existing NTA dataset, we performed over 70,000 simulations of variable replication designs and calculated false discovery rates (FDRs) and false negative rates (FNRs) for NTA features and occurrences. We used regression models to explore associations between replication percentage and FDR/FNR, and to test whether rates were affected by NTA feature attributes. Inverse relationships were generally observed between replication percentage and FDR/FNR, such that lower replication yielded higher information penalties. Significant increases in FDR/FNR were observed for suspected per- and polyfluoroalkyl substances (PFAS) compared to non-PFAS, highlighting the potential for differences in information penalties across feature groups. Specific quantitative information penalties are expected to be unique for each NTA study based on sample type and workflow. The methods presented here can support future pilot-scale investigations that will inform the required level of replication in full-scale studies.
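The simulation idea in the abstract (varying the share of replicated samples and scoring FDR and FNR against the fully replicated result) can be illustrated with a minimal sketch. The abstract does not describe the implementation, so everything below is an assumption made for illustration: triplicate injections, a 2-of-3 detection rule as the reference call, acceptance of a single injection as-is for non-replicated samples, the synthetic data, and all function names. It is not the authors' workflow.

```python
import random

N_REPLICATES = 3  # assumption: triplicate injections per sample
MIN_HITS = 2      # assumption: "real" means detected in at least 2 of 3 replicates


def make_synthetic_data(n_samples, n_features, rng):
    """Build hypothetical detections: (sample, feature) -> list of replicate booleans."""
    data = {}
    for s in range(n_samples):
        for f in range(n_features):
            # Roughly half the occurrences behave like real signals (high
            # per-replicate detection probability), the rest like artifacts.
            p_detect = 0.9 if rng.random() < 0.5 else 0.1
            data[(s, f)] = [rng.random() < p_detect for _ in range(N_REPLICATES)]
    return data


def simulate_design(data, replication_fraction, rng):
    """Simulate one reduced-replication design and return (FDR, FNR).

    A random fraction of samples keeps all replicates; every other sample
    keeps one randomly chosen injection, whose detections are accepted as-is.
    The reference ("truth") is the call made from the full replicate set.
    """
    samples = sorted({sample for sample, _ in data})
    n_replicated = round(replication_fraction * len(samples))
    replicated = set(rng.sample(samples, n_replicated))
    kept = {s: rng.randrange(N_REPLICATES) for s in samples}

    tp = fp = fn = 0
    for (sample, _feature), detections in data.items():
        truth = sum(detections) >= MIN_HITS
        called = truth if sample in replicated else detections[kept[sample]]
        if called and truth:
            tp += 1
        elif called and not truth:
            fp += 1
        elif truth and not called:
            fn += 1

    fdr = fp / (tp + fp) if (tp + fp) else 0.0  # false discoveries among all calls
    fnr = fn / (tp + fn) if (tp + fn) else 0.0  # missed occurrences among all real ones
    return fdr, fnr


if __name__ == "__main__":
    rng = random.Random(0)
    data = make_synthetic_data(n_samples=20, n_features=50, rng=rng)
    for fraction in (0.0, 0.25, 0.5, 0.75, 1.0):
        fdr, fnr = simulate_design(data, fraction, rng)
        print(f"replication={fraction:.2f}  FDR={fdr:.3f}  FNR={fnr:.3f}")
```

Repeating `simulate_design` many times per replication fraction and regressing the resulting rates on that fraction would mirror, in spirit, the kind of analysis the abstract describes. The numbers from this toy data carry no meaning for any real NTA study.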
About the journal:
Analytical and Bioanalytical Chemistry’s mission is the rapid publication of excellent and high-impact research articles on fundamental and applied topics of analytical and bioanalytical measurement science. Its scope is broad, and ranges from novel measurement platforms and their characterization to multidisciplinary approaches that effectively address important scientific problems. The Editors encourage submissions presenting innovative analytical research in concept, instrumentation, methods, and/or applications, including: mass spectrometry, spectroscopy, and electroanalysis; advanced separations; analytical strategies in “-omics” and imaging, bioanalysis, and sampling; miniaturized devices, medical diagnostics, sensors; analytical characterization of nano- and biomaterials; chemometrics and advanced data analysis.