CytoBatchFlagR: A Comprehensive Framework to Objectively Assess High-Parameter Cytometry Data for Batch Effects.

IF 2.1, Zone 4 (Biology), JCR Q3, Biochemical Research Methods
Shruti Eswar, Zachary T Koenig, Amanda R Tursi, José Cobeña-Reyes, Tamara Tilburgs, Sandra Andorf
DOI: 10.1002/cyto.a.70024 (https://doi.org/10.1002/cyto.a.70024)
Journal: Cytometry Part A
Publication date: 2026-04-12
Publication type: Journal Article
Citations: 0

Abstract

Rapid advancements in mass and flow cytometry technologies have allowed researchers to generate and analyze high-dimensional single-cell datasets, often utilizing upwards of 40 protein markers. Such high-parameter cytometry is increasingly used in longitudinal immunological studies, but technical variations across experimental batch runs can confound biological signals. To mitigate the impact on downstream analyses, many studies include reference control samples in every run, and several approaches exist to adjust for batch effects. However, tools that objectively identify problematic batches and markers present within a dataset are limited. We introduce CytoBatchFlagR, a comprehensive and interpretable tool designed to flag batch-related problems at the marker and cell cluster level based on robust statistical evaluations. Batch and marker variations are assessed based on median signal intensities of negative and positive cell populations and positive cell frequencies, along with Earth Mover's Distance (EMD) of signal intensity distributions. Additionally, CytoBatchFlagR identifies cell-type-specific batch problems via unsupervised clustering. The tool is suitable for mass and flow cytometry datasets, where it objectively detects distinct types of batch issues. We developed and tested CytoBatchFlagR using three cytometry datasets to demonstrate its utility and performance. We also demonstrated CytoBatchFlagR's effectiveness in assessing datasets that include or lack reference controls. CytoBatchFlagR improves quality control by enabling objective identification of technical variations that may impact downstream analysis in high-parameter cytometry data. The tool uses a series of complementary metrics to identify potential batch-related problems at the marker and cell population level and presents the results through interpretable visualizations. This allows users to make informed decisions about whether to apply batch correction or exclude specific batches or markers from downstream analyses. CytoBatchFlagR is freely available as R scripts, with documentation and a tutorial to help users get started.
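The metrics named in the abstract (medians of negative and positive populations, positive-cell frequency, and EMD between intensity distributions) can be illustrated with a small sketch. This is not the authors' implementation: CytoBatchFlagR is distributed as R scripts, whereas the Python below is only an illustration. The function names, the user-supplied `cutoff` gate, and the median/MAD outlier rule with `n_mads=3.0` are all assumptions made for this example, and the data are toy values.

```python
from bisect import bisect_right
from statistics import median


def emd_1d(a, b):
    """Earth Mover's Distance (1-D Wasserstein-1) between two samples.

    Integrates |F_a(x) - F_b(x)| over x, where F_a and F_b are the
    empirical CDFs of the two samples.
    """
    a, b = sorted(a), sorted(b)
    grid = sorted(set(a) | set(b))
    total = 0.0
    for left, right in zip(grid, grid[1:]):
        cdf_a = bisect_right(a, left) / len(a)
        cdf_b = bisect_right(b, left) / len(b)
        total += abs(cdf_a - cdf_b) * (right - left)
    return total


def marker_summary(values, cutoff):
    """Median of negative/positive cells and positive-cell frequency.

    `cutoff` is a hypothetical, user-supplied gate separating negative
    from positive cells for one marker.
    """
    neg = [v for v in values if v <= cutoff]
    pos = [v for v in values if v > cutoff]
    return {
        "neg_median": median(neg) if neg else None,
        "pos_median": median(pos) if pos else None,
        "pos_freq": len(pos) / len(values),
    }


def flag_batches(batches, n_mads=3.0):
    """Flag batches whose typical EMD to the other batches is an outlier.

    Each batch is scored by its median EMD to every other batch. A batch
    is flagged when its score exceeds the median score by more than
    `n_mads` median absolute deviations (an assumed rule, not the tool's).
    """
    names = list(batches)
    scores = {
        n: median(emd_1d(batches[n], batches[m]) for m in names if m != n)
        for n in names
    }
    center = median(scores.values())
    mad = median(abs(s - center) for s in scores.values())
    flagged = {n for n, s in scores.items() if s - center > n_mads * mad}
    return flagged, scores


# Toy data: one marker, five batches; batch B5 carries a large intensity shift.
base = [0.0, 0.1, 0.2, 0.3, 2.0, 2.1, 2.2, 2.3]
shifts = {"B1": 0.0, "B2": 0.02, "B3": -0.02, "B4": 0.01, "B5": 1.0}
batches = {name: [v + s for v in base] for name, s in shifts.items()}

flagged, scores = flag_batches(batches)
print(flagged)  # {'B5'}
print(marker_summary(batches["B1"], cutoff=1.0))
```

Scoring each batch by its median EMD to the other batches keeps a single aberrant batch from inflating the scores of the well-behaved ones. The rule needs at least three batches to be meaningful, and real data would also require transformed intensities and a defensible gate per marker.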

Source journal: Cytometry Part A
Subject category: Biology, Biochemical Research Methods
CiteScore: 8.10
Self-citation rate: 13.50%
Articles published: 183
Review time: 4-8 weeks
Journal description: Cytometry Part A, the journal of quantitative single-cell analysis, features original research reports and reviews of innovative scientific studies employing quantitative single-cell measurement, separation, manipulation, and modeling techniques, as well as original articles on mechanisms of molecular and cellular functions obtained by cytometry techniques. The journal welcomes submissions from multiple research fields that fully embrace the study of the cytome: Biomedical Instrumentation Engineering, Biophotonics, Bioinformatics, Cell Biology, Computational Biology, Data Science, Immunology, Parasitology, Microbiology, Neuroscience, Cancer, Stem Cells, Tissue Regeneration.