To bin or not to bin: why parasite abundance data should not be lumped into categories for statistical analysis.

IF 2.1 | Zone 3, Medicine | JCR Q2, Parasitology
Robert Poulin
Parasitology, pp. 1-8. Journal Article. Published 2025-03-24. DOI: 10.1017/S003118202500040X (https://doi.org/10.1017/S003118202500040X)

Abstract

The impact of macroparasites on their hosts is proportional to the number of parasites per host, or parasite abundance. Abundance values are count data, i.e. integers ranging from 0 to some maximum number, depending on the host-parasite system. When using parasite abundance as a predictor in statistical analysis, a common approach is to bin values, i.e. group hosts into infection categories based on abundance, and test for differences in some response variable (e.g. a host trait) among these categories. There are well-documented pitfalls associated with this approach. Here, I use a literature review to show that binning abundance values for analysis has been used in one-third of studies published in parasitological journals over the past 15 years, and half of the studies in ecological and behavioural journals, often without any justification. Binning abundance data into arbitrary categories has been much more common among studies using experimental infections than among those using naturally infected hosts. I then use simulated data to demonstrate that true and significant relationships between parasite abundance and host traits can be missed when abundance values are binned for analysis and, conversely, that when there is no underlying relationship between abundance and host traits, analysis of binned data can create a spurious one. This holds regardless of the prevalence of infection or the level of parasite aggregation in a host sample. These findings argue strongly for the practice of binning abundance data as a predictor variable to be abandoned in favour of more appropriate analytical approaches.
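
The information loss described in the abstract can be illustrated with a small simulation. The sketch below is not the author's simulation code; it assumes aggregated counts drawn from a negative binomial distribution, a host trait that depends linearly on abundance, and three arbitrary infection categories (uninfected, light, heavy). All function names, parameter values and the bin threshold are hypothetical choices made for illustration.

```python
import numpy as np


def simulate_hosts(n_hosts, mean_abundance, k, slope, noise_sd, rng):
    """Draw aggregated parasite counts and a host trait that depends on them.

    Assumption: counts follow a negative binomial with mean `mean_abundance`
    and aggregation parameter `k` (smaller k = more aggregated).
    """
    p = k / (k + mean_abundance)  # numpy parameterisation: mean = k * (1 - p) / p
    abundance = rng.negative_binomial(k, p, size=n_hosts)
    trait = slope * abundance + rng.normal(0.0, noise_sd, size=n_hosts)
    return abundance, trait


def bin_abundance(abundance, light_max):
    """Bin counts into 0 = uninfected, 1 = light (1..light_max), 2 = heavy."""
    return np.digitize(abundance, [1, light_max + 1])


def r2_continuous(abundance, trait):
    """Share of trait variance explained by a linear fit on raw counts."""
    return np.corrcoef(abundance, trait)[0, 1] ** 2


def r2_binned(bins, trait):
    """Share of trait variance explained by differences among bin means."""
    grand_mean = trait.mean()
    ss_total = ((trait - grand_mean) ** 2).sum()
    ss_between = sum(
        (bins == b).sum() * (trait[bins == b].mean() - grand_mean) ** 2
        for b in np.unique(bins)
    )
    return ss_between / ss_total


if __name__ == "__main__":
    rng = np.random.default_rng(1)  # fixed seed so the run is repeatable
    abundance, trait = simulate_hosts(
        n_hosts=2000, mean_abundance=10.0, k=0.5,
        slope=-0.5, noise_sd=5.0, rng=rng,
    )
    bins = bin_abundance(abundance, light_max=5)
    print("variance explained, raw counts:", round(r2_continuous(abundance, trait), 3))
    print("variance explained, binned:    ", round(r2_binned(bins, trait), 3))
```

With a truly linear effect, hosts carrying very different parasite loads end up in the same "heavy" category, so the binned analysis explains less of the trait variance than the analysis on raw counts. Changing `k`, `mean_abundance` or `light_max` is one way to explore how sensitive that loss is to aggregation, prevalence and the arbitrary choice of cut-off.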
