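Do experimental projection methods outcompete retention time prediction models in non-target screening? A case study on LC/HRMS interlaboratory comparison data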

IF 3.3 | Zone 3 (Chemistry) | JCR Q2 (Chemistry, Analytical)

Analyst | Pub Date: 2025-07-08 | DOI: 10.1039/D5AN00323G

Louise Malm and Anneli Kruve


Citation count: 0

Abstract

Retention time (RT) is essential for evaluating the likelihood of candidate structures in non-target screening (NTS) with liquid chromatography high-resolution mass spectrometry (LC/HRMS). Approaches for estimating the RTs of candidate structures can broadly be divided into projection and prediction methods. The first approach takes advantage of public databases of RTs measured on similar chromatographic systems (CS_source) and projects these to the chromatographic system applied in the NTS (CS_NTS) based on a small set of commonly analyzed chemicals. The second approach leverages machine learning (ML) model(s) trained on publicly available RT data measured on one or more chromatographic systems (CS_training). However, CS_source and CS_training might differ substantially from CS_NTS. Therefore, it is of interest to evaluate the generalizability of projection models and prediction models to the CSs routinely applied in NTS. Here we take advantage of the recent NORMAN interlaboratory comparison, in which 41 known calibration chemicals and 45 suspects were analyzed, to evaluate both the projection and prediction approaches on 37 CSs. The accuracy of both approaches was directly linked to the similarity of the CSs, with the pH of the mobile phase and the column chemistry found to be the most impactful factors. Furthermore, in cases where CS_source and CS_NTS differ substantially but CS_training and CS_NTS are similar, prediction models often performed on par with the projection models. These findings highlight the need to account for the mobile phase and column chemistry both in ML model training and in selecting the prediction model for RT.
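
The projection idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the polynomial mapping, the function names (`fit_projection`, `mean_absolute_error`), and all RT values below are assumptions made up for illustration. The sketch fits a mapping from CS_source to CS_NTS on a few shared calibration chemicals, projects suspect RTs through it, and compares the error with that of a hypothetical ML prediction.

```python
import numpy as np


def fit_projection(rt_source_cal, rt_nts_cal, degree=1):
    """Fit a polynomial mapping CS_source RT -> CS_NTS RT.

    The mapping is fitted on calibration chemicals measured on both systems.
    A polynomial is an assumption of this sketch; other monotonic fits
    could be used instead.
    """
    coeffs = np.polyfit(rt_source_cal, rt_nts_cal, deg=degree)
    return np.poly1d(coeffs)


def mean_absolute_error(observed, estimated):
    """Mean absolute difference between observed and estimated RTs."""
    observed = np.asarray(observed, dtype=float)
    estimated = np.asarray(estimated, dtype=float)
    return float(np.mean(np.abs(observed - estimated)))


# Synthetic calibration RTs in minutes (illustrative values only).
rt_source_cal = np.array([1.0, 3.0, 5.0, 8.0, 12.0])
rt_nts_cal = 1.5 * rt_source_cal + 0.5  # perfectly linear by construction

project = fit_projection(rt_source_cal, rt_nts_cal)

# Suspects with a database RT on CS_source, projected onto CS_NTS.
rt_source_suspects = np.array([2.0, 6.0, 10.0])
rt_projected = project(rt_source_suspects)

# Hypothetical output of an ML prediction model for the same suspects.
rt_predicted_ml = np.array([4.0, 9.0, 16.5])

# Hypothetical RTs actually observed for the suspects on CS_NTS.
rt_observed = np.array([3.6, 9.4, 15.8])

mae_projection = mean_absolute_error(rt_observed, rt_projected)
mae_prediction = mean_absolute_error(rt_observed, rt_predicted_ml)
print(f"projection MAE: {mae_projection:.2f} min")
print(f"prediction MAE: {mae_prediction:.2f} min")
```

With real data, the quality of the fitted mapping would depend on how similar CS_source and CS_NTS are, which is the point the abstract makes about mobile phase pH and column chemistry.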