Machine learning approaches to dissect hybrid and vaccine-induced immunity
Giorgio Montesi, Simone Costagli, Simone Lucchesi, Jacopo Polvere, Fabio Fiorino, Gabiria Pastore, Margherita Sambo, Mario Tumbarello, Massimiliano Fabbiani, Francesca Montagnani, Donata Medaglini, Elena Pettini, Annalisa Ciabattini
Communications Medicine 5 (1), 282. Published 2025-07-08. DOI: https://doi.org/10.1038/s43856-025-00987-4. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12238572/pdf/
Abstract
Background: The spread of the SARS-CoV-2 Omicron variant and its subvariants, which are highly transmissible but cause milder disease, has increased the number of unreported infections. Identifying individuals who are unaware of having been infected is crucial for estimating the true prevalence of infection and evaluating the breadth of hybrid immunity. This study addresses that challenge by applying several Machine Learning approaches.
Methods: A group of 116 participants vaccinated against SARS-CoV-2 was enrolled in the IMMUNO_COV study at Siena University Hospital, Italy. Blood samples were collected before and six months after the third vaccine dose. Machine Learning analyses, involving dimensionality reduction techniques, unsupervised clustering methods, and classification models, were applied to serological data, including antibody responses specific for the wild-type SARS-CoV-2 strain as well as the Delta, Omicron BA.1, and Omicron BA.2 variants. Spike- and nucleocapsid-specific B cells were also assessed in each participant.
Results: Using dimensionality reduction and unsupervised clustering, participants are grouped into high and low responders, with infected participants mainly falling among the high responders. A consensus-based approach combining k-nearest neighbors (k-NN), random forest (RF), and support vector machine (SVM) models identifies 14 participants unaware of previous infection. Their immunological profile, characterized by strong spike- and nucleocapsid-specific humoral and B cell responses, differs significantly from that of non-infected participants.
Conclusions: Machine Learning approaches are applied to identify participants unaware of prior infection and to dissect their hybrid immunity profiles. Based on serological data, this cost-effective method can be a valuable tool for estimating the true prevalence of infection, improving comprehension of immune responses elicited by vaccination alone or combined with infection, and tailoring public health interventions.
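The consensus step mentioned in the Results can be illustrated with a minimal sketch. The abstract does not state how the k-NN, RF, and SVM outputs are combined, so the unanimous-agreement rule used here is an assumption, not the authors' documented method. The participant IDs, labels, and function names (`consensus_flags`, `unaware_candidates`) are hypothetical and exist only for illustration; the per-model predictions are hard-coded rather than produced by trained classifiers.

```python
from typing import Dict, List, Optional


def consensus_flags(predictions: Dict[str, List[int]],
                    min_agree: Optional[int] = None) -> List[bool]:
    """Flag each participant predicted as infected (label 1) by enough models.

    With min_agree=None, every model must agree (unanimous consensus).
    This voting rule is an assumption made for the sketch.
    """
    models = list(predictions.values())
    if not models:
        raise ValueError("at least one model is required")
    n = len(models[0])
    if any(len(m) != n for m in models):
        raise ValueError("all models must score the same participants")
    threshold = len(models) if min_agree is None else min_agree
    return [sum(m[i] for m in models) >= threshold for i in range(n)]


def unaware_candidates(ids: List[str],
                       reported_infection: List[bool],
                       predictions: Dict[str, List[int]],
                       min_agree: Optional[int] = None) -> List[str]:
    """Return participants flagged by consensus who did not report an infection."""
    flags = consensus_flags(predictions, min_agree)
    return [pid
            for pid, reported, flagged in zip(ids, reported_infection, flags)
            if flagged and not reported]


# Hypothetical toy data: five participants, three classifiers.
ids = ["P01", "P02", "P03", "P04", "P05"]
reported = [True, False, False, False, True]  # self-reported prior infection
predictions = {
    "knn": [1, 1, 0, 1, 1],
    "rf":  [1, 1, 0, 0, 1],
    "svm": [1, 1, 0, 1, 0],
}

print(unaware_candidates(ids, reported, predictions))
```

Requiring all three models to agree trades sensitivity for fewer false positives; passing `min_agree=2` would switch to a majority vote and flag more participants.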