Cleanet: Robust Doublet Detection in Cytometry Data Based on Protein Expression Patterns
Matei Ionita, Michelle L McKeague, Mark M Painter, Divij Mathew, Ajinkya Pattekar, Ayman Rezk, Shwetank, Damian Maseda, E John Wherry, Allison R Greenplate
Cytometry Part A. Published 2025-10-09. DOI: 10.1002/cyto.a.24961 (https://doi.org/10.1002/cyto.a.24961)
Abstract
Flow and mass cytometry experiments are essential for profiling immune cells at single-cell resolution. Better understanding of human immunology increasingly involves analyzing studies at the scale of hundreds or thousands of samples, with data analysis a significant bottleneck. This trend increases the demand for automated analysis methods. In particular, a common preprocessing step in cytometry data analysis is distinguishing single cells from doublets (or multiplets), events in which two (or more) cells pass simultaneously through the detector. Typically, doublets are identified on two-dimensional density plots, using their high measured values for DNA intercalators (mass cytometry) or scattering channels (flow cytometry). Despite its popularity, this bivariate gating method is sometimes imprecise: for example, we show that bivariate gating of mass cytometry data can mistake single eosinophils for doublets, due to their high DNA content. Taking inspiration from methods already used in single-cell transcriptomics, but not in the cytometry community, we propose an alternative approach. Our method, called Cleanet, first simulates doublet events, then identifies true events with protein expression similar to the simulated doublets. This simple method is completely automated and detects both homotypic and heterotypic doublets. We validate it in datasets acquired with mass and flow cytometry; moreover, we verify with imaging flow cytometry data from ImageStream and Discover A8 instruments that most events predicted to be doublets truly consist of multiple cells. Cleanet can also classify doublets based on their component cell types, which potentially enables the study of cell-cell interactions, mining extra information out of doublet events that would otherwise be discarded. As a proof of concept, we demonstrate that Cleanet can detect a treatment-specific increase in interactions between two cell lines. 
By automating doublet detection and classification, we aim to streamline the data analysis in large cytometry studies and provide a more accurate picture of both immune cell populations and cell-cell interactions.
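The two steps named in the abstract (simulate doublets, then find real events whose protein expression resembles them) can be illustrated with a short sketch. This is not the Cleanet implementation. It is a generic nearest-neighbor version of the idea, in the spirit of doublet detectors used in single-cell transcriptomics. It assumes that a doublet's signal is roughly the sum of its two component cells' signals. The function names, the arcsinh cofactor, the neighborhood size `k`, and the 0.5 threshold are all hypothetical choices made for illustration.

```python
import numpy as np


def simulate_doublets(events, n_sim, rng):
    """Create synthetic doublets by summing the marker values of random event pairs.

    Assumption: a doublet's measured signal is approximately additive.
    """
    i = rng.integers(0, len(events), size=n_sim)
    j = rng.integers(0, len(events), size=n_sim)
    return events[i] + events[j]


def flag_doublets(events, n_sim=None, k=10, threshold=0.5, cofactor=5.0, seed=0):
    """Return a boolean mask marking real events that resemble simulated doublets.

    events: (n_events, n_markers) array of raw, non-negative intensities.
    All defaults are illustrative, not values taken from the paper.
    """
    rng = np.random.default_rng(seed)
    n_real = len(events)
    n_sim = n_real if n_sim is None else n_sim
    simulated = simulate_doublets(events, n_sim, rng)

    # Sum on the raw scale, then compare on the arcsinh scale.
    real_t = np.arcsinh(events / cofactor)
    sim_t = np.arcsinh(simulated / cofactor)
    pooled = np.vstack([real_t, sim_t])
    is_sim = np.concatenate([np.zeros(n_real, dtype=bool),
                             np.ones(n_sim, dtype=bool)])

    # Brute-force squared distances from each real event to every pooled event.
    # Fine for a small demo; large datasets would need an approximate index.
    d2 = ((real_t[:, None, :] - pooled[None, :, :]) ** 2).sum(axis=2)
    d2[np.arange(n_real), np.arange(n_real)] = np.inf  # ignore self-matches

    # Indices of the k nearest pooled events for each real event.
    nn = np.argpartition(d2, k, axis=1)[:, :k]

    # A real event is flagged when most of its neighbors are simulated doublets.
    frac_sim = is_sim[nn].mean(axis=1)
    return frac_sim > threshold
```

Calling `flag_doublets(events)` on an events-by-markers intensity matrix returns one boolean per event. Because the simulated pairs are drawn at random, they include pairs of the same cell type as well as pairs of different types, so a neighbor-based rule of this kind can in principle pick up both homotypic and heterotypic doublets without any reference to DNA or scatter channels.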
About the journal:
Cytometry Part A, the journal of quantitative single-cell analysis, features original research reports and reviews of innovative scientific studies employing quantitative single-cell measurement, separation, manipulation, and modeling techniques, as well as original articles on mechanisms of molecular and cellular functions obtained by cytometry techniques.
The journal welcomes submissions from multiple research fields that fully embrace the study of the cytome:
Biomedical Instrumentation Engineering
Biophotonics
Bioinformatics
Cell Biology
Computational Biology
Data Science
Immunology
Parasitology
Microbiology
Neuroscience
Cancer
Stem Cells
Tissue Regeneration.