Rethinking statistical approaches for serological data analysis for viral surveillance

IF 2.2 | Zone 4, Medicine | JCR Q3, Biochemical Research Methods
Morgan P. Kain, Jonathan H. Epstein, Noam Ross
Journal of Virological Methods, vol. 335, Article 115149. Published 2025-03-21. Journal article; not open access.
DOI: 10.1016/j.jviromet.2025.115149
https://www.sciencedirect.com/science/article/pii/S0166093425000424
Citations: 0

Abstract

A robust serological surveillance system for zoonotic pathogens is imperative both for early detection and for advancing knowledge of emerging diseases. A statistical analysis plan that is aligned with research and epidemiological goals requires a purposeful choice among alternative methods for differentiating seronegative from seropositive samples, estimating seroprevalence, and estimating risk factors associated with seropositivity. The common standard deviation-based cutoff approach (e.g., 3 SD) is simple to implement and understand, but it fails to propagate uncertainty in serostatus assignments to any downstream risk factor analysis. Methods such as Gaussian mixture models, which jointly estimate serostatus, risk factors, and their uncertainty, can alleviate the dichotomy created by the cutoff approach. Yet, because empirical guidance on method performance is lacking, it remains difficult to choose a robust analysis method for a given serological dataset. Here we examine the performance of both cutoff and clustering approaches using simulated datasets that represent the epidemiological, biological, and immunological data-generating process. We focus on understudied pathogens for which validated serological assays do not exist, as is common for emerging viruses in wildlife. We quantify coverage (the proportion of the time that 95% confidence intervals contain the true value) and bias (systematic differences between true values and model point estimates) of model estimates for individual serostatus assignments, population seroprevalence, and regression coefficients for serostatus risk factors. In nearly all scenarios, Bayesian mixture models provide the highest coverage and lowest bias. Only with very low seroprevalence (below roughly 3%) and large differences in signal between seronegative and seropositive individuals will a cutoff provide low bias and near-nominal coverage. Given poor coverage of risk factor regression coefficients, we advise against using a cutoff approach for quantifying determinants of seropositivity.
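The contrast between a cutoff and a mixture model can be illustrated with a minimal sketch. It is not the authors' simulation or model. It assumes a one-dimensional Gaussian signal for each serostatus group, a cutoff computed from the pooled sample itself (mean + 3 SD, on the assumption that no negative controls are available), and a two-component Gaussian mixture fitted by maximum-likelihood EM rather than the Bayesian mixture model the abstract refers to. It also omits risk factors and interval estimates, so it speaks only to bias in seroprevalence, not to coverage. All function names and parameter values are hypothetical.

```python
import math
import random
import statistics


def simulate(n=1000, prevalence=0.2, neg=(0.0, 1.0), pos=(4.0, 1.0), seed=1):
    """Draw assay signals from seronegative and seropositive Gaussian groups.

    neg and pos are (mean, SD) pairs; all values are illustrative assumptions.
    """
    rng = random.Random(seed)
    status = [rng.random() < prevalence for _ in range(n)]
    signal = [rng.gauss(*(pos if s else neg)) for s in status]
    return signal, status


def cutoff_prevalence(signal, n_sd=3.0):
    """Seroprevalence from a mean + n_sd * SD cutoff on the pooled sample."""
    threshold = statistics.fmean(signal) + n_sd * statistics.stdev(signal)
    return sum(x > threshold for x in signal) / len(signal)


def normal_pdf(x, mu, sd):
    z = (x - mu) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def mixture_prevalence(signal, n_iter=100):
    """Weight of the high-signal component of a 2-component Gaussian mixture (EM)."""
    lo, hi = min(signal), max(signal)
    mu = [lo + 0.25 * (hi - lo), lo + 0.75 * (hi - lo)]  # low, high component
    sd = [statistics.stdev(signal)] * 2
    w = 0.5  # mixing weight of the high (seropositive) component
    n = len(signal)
    for _ in range(n_iter):
        # E-step: posterior probability that each sample is seropositive
        resp = []
        for x in signal:
            p1 = w * normal_pdf(x, mu[1], sd[1])
            p0 = (1.0 - w) * normal_pdf(x, mu[0], sd[0])
            resp.append(p1 / (p0 + p1))
        # M-step: update weight, means, and SDs from the soft assignments
        n1 = sum(resp)
        n0 = n - n1
        w = n1 / n
        mu[1] = sum(r * x for r, x in zip(resp, signal)) / n1
        mu[0] = sum((1.0 - r) * x for r, x in zip(resp, signal)) / n0
        var1 = sum(r * (x - mu[1]) ** 2 for r, x in zip(resp, signal)) / n1
        var0 = sum((1.0 - r) * (x - mu[0]) ** 2 for r, x in zip(resp, signal)) / n0
        sd = [max(math.sqrt(var0), 1e-6), max(math.sqrt(var1), 1e-6)]
    return w


def mean_bias(estimator, seeds=range(5)):
    """Average (estimate - true sample seroprevalence) over replicate datasets."""
    errors = []
    for seed in seeds:
        signal, status = simulate(seed=seed)
        errors.append(estimator(signal) - sum(status) / len(status))
    return statistics.fmean(errors)


if __name__ == "__main__":
    print("cutoff bias: ", round(mean_bias(cutoff_prevalence), 3))
    print("mixture bias:", round(mean_bias(mixture_prevalence), 3))
```

With a simulated seroprevalence of 20%, the seropositive samples inflate the pooled mean and SD, which pushes the cutoff far into the seropositive distribution. This is one way to see why a cutoff can only be expected to perform well when seroprevalence is very low and the two groups are well separated.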
Source journal
CiteScore: 5.80
Self-citation rate: 0.00%
Articles published: 209
Review time: 41 days
Journal description: The Journal of Virological Methods focuses on original, high-quality research papers that describe novel and comprehensively tested methods which enhance human, animal, plant, bacterial, or environmental virology and prion research and discovery. The methods may include, but are not limited to, the study of: viral components and morphology; virus isolation, propagation, and development of viral vectors; viral pathogenesis, oncogenesis, vaccines, and antivirals; virus replication, host-pathogen interactions and responses; virus transmission, prevention, control, and treatment; viral metagenomics and the virome; virus ecology, adaptation, and evolution; applied virology such as nanotechnology; and viral diagnosis with novelty and comprehensive evaluation. The journal seeks articles, systematic reviews, meta-analyses, and laboratory protocols that include comprehensive technical details with statistical confirmations, that provide validations against current best practice, international standards, or quality assurance programs, and that advance knowledge in virology leading to improved medical, veterinary, or agricultural practices and management.