Application of a Deep Learning Model to Predict Liquid Chromatography Retention Times of Food Peptides Across Chromatographic Conditions

IF 2.8 · Zone 3 (Engineering and Technology) · JCR Q2, Chemistry, Analytical

Journal of separation science Pub Date : 2025-09-24 DOI:10.1002/jssc.70270

Boudewijn Hollebrands, Jos Hageman, Hans-Gerd Janssen

Journal of Separation Science, volume 48, issue 9. Full text: https://analyticalsciencejournals.onlinelibrary.wiley.com/doi/10.1002/jssc.70270

Abstract

Comparing predicted and measured retention times can greatly enhance the reliability of peptide identification in LC-MS analysis of smaller, food-derived peptides, where MS spectral information alone is often insufficient. Unfortunately, the extensive datasets of peptide retention times in proteomics repositories, and the prediction models derived from them, have limited applicability to food-derived peptides because of the structural diversity of these peptides. To address this, we applied a transfer learning approach: a generic deep learning model, initially trained on large proteomics datasets, was fine-tuned on our own experimental data obtained from commercial peptide standards.
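The fine-tuning idea can be illustrated with a deliberately simplified sketch. It is not the authors' model: the abstract does not say which network was used or which layers were retrained, so the example assumes a frozen "pretrained" scorer (here a plain sum of hypothetical per-residue weights rather than a deep network) and retrains only a linear output head on a handful of standards. All weights, peptides, and retention times are made up for illustration.

```python
# Toy illustration only: frozen "pretrained" scorer + retrained linear head.
# Every coefficient, peptide and retention time below is synthetic.

# Hypothetical per-residue contributions standing in for a pretrained model.
PRETRAINED_WEIGHTS = {
    "A": 0.5, "G": 0.0, "L": 2.0, "F": 2.5,
    "K": -0.5, "E": 0.1, "V": 1.2, "P": 0.4,
}


def pretrained_score(peptide):
    """Frozen part: generic score, never updated during fine-tuning."""
    return sum(PRETRAINED_WEIGHTS.get(res, 0.0) for res in peptide)


def fine_tune_head(peptides, measured_rt, epochs=5000, lr=0.01):
    """Retrain only the head rt = slope * score + intercept (gradient descent on MSE)."""
    slope, intercept = 1.0, 0.0  # start from the generic model's identity head
    scores = [pretrained_score(p) for p in peptides]
    n = len(scores)
    for _ in range(epochs):
        errors = [slope * s + intercept - y for s, y in zip(scores, measured_rt)]
        grad_slope = 2.0 * sum(e * s for e, s in zip(errors, scores)) / n
        grad_intercept = 2.0 * sum(errors) / n
        slope -= lr * grad_slope
        intercept -= lr * grad_intercept
    return slope, intercept


def predict_rt(peptide, slope, intercept):
    """Predicted retention time (min) from the frozen scorer and the retrained head."""
    return slope * pretrained_score(peptide) + intercept


# Synthetic "commercial standards" measured under one chromatographic condition.
standards = ["GL", "AF", "LLK", "VFE", "GG", "FLL"]
measured = [8.0, 11.0, 12.5, 13.4, 2.0, 21.5]

slope, intercept = fine_tune_head(standards, measured)
print(round(predict_rt("AL", slope, intercept), 2))
```

Because only two head parameters are fitted, six standards are enough in this toy case. That mirrors, in miniature, why retraining needs far less data than building a model from scratch.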

The method uses an easy-to-implement retraining strategy that significantly reduces data requirements and training time compared to building a model from scratch. The retrained model demonstrated strong predictive performance (Q² > 0.98), and 95% of the retention time predictions for a yeast protein hydrolysate validation set fell within a ±1.0 min window across a wide range of chromatographic conditions, demonstrating both its robustness and its practical relevance. We further validated this approach by applying it to the analysis of plant protein hydrolysates. The good performance observed there demonstrates its versatility and its applicability to diverse sets of peptides, including tryptic and non-tryptic peptides.
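The two figures of merit quoted above can be computed as in the following sketch. The abstract does not give the formula used for Q², so the code assumes the common definition, Q² = 1 − PRESS/TSS, evaluated on held-out predictions with the validation-set mean in the total sum of squares. The function names and the example values are illustrative, not taken from the paper.

```python
def q2(measured, predicted):
    """Q^2 = 1 - PRESS / TSS on held-out data (assumed definition)."""
    mean_y = sum(measured) / len(measured)
    press = sum((y - p) ** 2 for y, p in zip(measured, predicted))
    tss = sum((y - mean_y) ** 2 for y in measured)
    return 1.0 - press / tss


def fraction_within(measured, predicted, window=1.0):
    """Share of predictions whose absolute error is at most `window` minutes."""
    hits = sum(1 for y, p in zip(measured, predicted) if abs(y - p) <= window)
    return hits / len(measured)


# Synthetic retention times in minutes, for illustration only.
measured_rt = [5.0, 10.0, 15.0, 20.0]
predicted_rt = [5.5, 9.5, 15.0, 21.5]

print(q2(measured_rt, predicted_rt))               # close to 0.978
print(fraction_within(measured_rt, predicted_rt))  # 0.75
```

The same ±1.0 min check can serve as a filter during identification: a candidate peptide whose predicted retention time falls outside the window around the measured peak is a less plausible assignment.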

Our work underscores the potential of transfer learning in chromatographic analysis, providing an efficient and adaptable tool for rapid and reliable peptide analysis in food research. Transfer learning made it possible to use extensive databases from the proteomics area in the much narrower, specialized field of food peptide analysis.

Source journal: Journal of Separation Science (Chemistry, Analytical Chemistry)

CiteScore: 6.30

Self-citation rate: 16.10%

Articles published: 408

Review time: 1.8 months

About the journal: The Journal of Separation Science (JSS) is the most comprehensive source in separation science, since it covers all areas of chromatographic and electrophoretic separation methods in theory and practice, both in the analytical and in the preparative mode, solid phase extraction, sample preparation, and related techniques. Manuscripts on methodological or instrumental developments, including detection aspects, in particular mass spectrometry, as well as on innovative applications will also be published. Manuscripts on hyphenation, automation, and miniaturization are particularly welcome. Pre- and post-separation facets of a total analysis may be covered as well as the underlying logic of the development or application of a method.