A predictive model developed to classify Leishmania promastigotes at two distinct life stages using MALDI-TOF mass spectrometry
Sebastian Cubides-Cely, Betsy Muñoz Serrano, Enrique Mejía-Ospino, Patricia Escobar
Archives of Microbiology, vol. 207, no. 11. Published 2025-09-22. DOI: 10.1007/s00203-025-04424-x
https://link.springer.com/article/10.1007/s00203-025-04424-x
Citations: 0
Abstract
Investigating the molecular differences between procyclic (non-infective) and metacyclic (infective) promastigotes is essential for understanding the Leishmania life cycle in the sandfly vector and may help identify molecular markers specific to these parasite stages. MALDI-TOF MS, a robust mass spectrometry technique, identifies protein profiles by measuring their mass-to-charge (m/z) ratios. Machine learning (ML) aids in analysing, interpreting, and classifying the resulting complex spectral datasets. This research aims to develop a predictive model that classifies procyclic and metacyclic stages from their MALDI-TOF MS protein profiles. Promastigotes from two clones of L. amazonensis, previously typed by a molecular approach, were cultured at 27 °C and collected on days 3 and 7 of growth. The data included at least 10 biological replicates, each in triplicate, for each L. amazonensis clone. Samples were labelled Clone1LB3D, Clone1LB7D, Clone2LP3D, and Clone2LP7D. Three supervised classifiers were used: support vector machine (SVM), artificial neural network (ANN), and random forest (RF). The implementation was carried out in Python 3.12. The predictor variables were the intensities of the parasites' spectral signals in the m/z range of 600 to 9500. The SVM classifier achieved 100% accuracy, while ANN and RF achieved 95% and 85%, respectively. A confusion matrix confirmed that SVM classified every sample correctly across clones and stages. To establish model robustness, we recommend external validation on independent datasets, including those from different L. amazonensis clones, related Leishmania species, other growth phases, and other sample preparation methods.
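The abstract states only that the predictors are spectral intensities in the m/z range 600 to 9500 and that a confusion matrix was used to check the classifiers. It does not describe the preprocessing or name the libraries. The following is a minimal standard-library sketch of those two steps, under stated assumptions: the fixed-width binning, the `BIN_WIDTH` value, the half-open treatment of the m/z window, and all function names are hypothetical, and the toy peaks and predictions are invented for illustration and are not results from the paper. The classifier itself (SVM, ANN, or RF) is deliberately left out.

```python
from collections import Counter

MZ_MIN, MZ_MAX = 600.0, 9500.0  # m/z window stated in the abstract
BIN_WIDTH = 100.0               # hypothetical bin width, not taken from the paper
LABELS = ["Clone1LB3D", "Clone1LB7D", "Clone2LP3D", "Clone2LP7D"]


def spectrum_to_features(peaks, bin_width=BIN_WIDTH):
    """Sum peak intensities into fixed-width m/z bins.

    Assumption: the window is treated as half-open, [MZ_MIN, MZ_MAX),
    so a peak at exactly 9500 is dropped along with anything outside.
    """
    n_bins = int((MZ_MAX - MZ_MIN) / bin_width)
    features = [0.0] * n_bins
    for mz, intensity in peaks:
        if MZ_MIN <= mz < MZ_MAX:
            features[int((mz - MZ_MIN) // bin_width)] += intensity
    return features


def confusion_matrix(y_true, y_pred, labels=LABELS):
    """Rows are true labels, columns are predicted labels."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def accuracy(matrix):
    """Fraction of samples that fall on the diagonal."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Toy (m/z, intensity) peaks, invented for illustration only.
toy_peaks = [(550.0, 3.0), (612.4, 10.0), (655.0, 5.0), (9499.0, 2.0), (9600.0, 7.0)]
features = spectrum_to_features(toy_peaks)

# Toy predictions with one deliberate error, invented for illustration only.
y_true = ["Clone1LB3D", "Clone1LB7D", "Clone2LP3D", "Clone2LP7D"]
y_pred = ["Clone1LB3D", "Clone1LB7D", "Clone2LP3D", "Clone2LP3D"]
matrix = confusion_matrix(y_true, y_pred)

print(len(features), accuracy(matrix))
```

In this sketch a classifier with 100% accuracy, as reported for SVM, would produce a matrix whose off-diagonal entries are all zero. Any non-zero off-diagonal cell shows which clone or growth day was confused with which.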
About the journal:
Research papers must make a significant and original contribution to microbiology and be of interest to a broad readership. The results of any experimental approach that meets these objectives are welcome, particularly biochemical, molecular genetic, physiological, and/or physical investigations into microbial cells and their interactions with their environments, including their eukaryotic hosts.
Mini-reviews in areas of special topical interest and papers on medical microbiology, ecology and systematics, including description of novel taxa, are also published.
Theoretical papers and those that report on the analysis or "mining" of data are acceptable in principle if new information, interpretations, or hypotheses emerge.