Multimodal distribution and its impact on the accurate assessment of spermatozoa morphological data: Lessons from machine learning

IF 2.2 · Zone 2 (Agricultural and Forestry Sciences) · Q1 (Agriculture, Dairy & Animal Science)
Journal: Animal Reproduction Science
Publication date: 2024-10-01
Publication type: Journal Article
DOI: 10.1016/j.anireprosci.2024.107564
URL: https://www.sciencedirect.com/science/article/pii/S0378432024001556
Citations: 0

Abstract

Objective assessment of sperm morphology is an essential component of evaluating ejaculate quality. Due to economic limitations, investigators often resort to observational studies instead of experimental ones, even though experimental studies provide the strongest statistical power. Observational studies yield more heterogeneous data regardless of the number of data sources (barns/farms). Using such data inevitably leads to higher variances of estimates, which reduces the statistical power of a study. In this article, we describe a statistical methodology called finite mixture modeling (FMM), which, based on the supplied data and an assumed number of sub-classes, classifies the data into two or more homogeneous distributions and determines their fractional size relative to the entire cohort. The goal is to use statistical methods that account for the heterogeneity in the sample variance. A figure from a previous publication was used to generate simulated data (n = 1559) on the cytoplasmic droplet rate. We identified that a bimodal distribution with two latent classes best described the simulated data. Post-hoc estimation showed that about 80% of observations belonged to latent class 1 and 20% to latent class 2. The FMM methodology identified a cutoff point of 8.7%. Finally, when estimating the standard error for the total cohort, the FMM methodology yielded a 40% reduction in the standard error compared to standard methodologies. In conclusion, we show that FMM successfully accounted for the heterogeneity in the variance of the data and, as such, yielded lower variance estimates than standard methodologies, increasing the statistical power of the cohort.
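
The workflow the abstract describes (fit a two-class mixture, read off the class fractions, locate a cutoff, compare standard errors) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' method or data: the class means and standard deviations are invented, the classes are taken to be Gaussian, the model is fitted with a hand-written EM loop, and the "mixture-based" standard error is only one plausible reading (a within-class variance under proportional allocation). Only the sample size of 1559 and the roughly 80/20 split come from the abstract, so the sketch is not expected to reproduce the reported 8.7% cutoff or the 40% reduction.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_class_mixture(data, n_iter=200):
    """Fit a two-component Gaussian mixture by EM; returns (weights, means, sds)."""
    ordered = sorted(data)
    n = len(ordered)
    # Starting values: 10th and 90th percentiles as means, half the overall sd.
    means = [ordered[n // 10], ordered[(9 * n) // 10]]
    grand_mean = sum(data) / n
    grand_sd = math.sqrt(sum((x - grand_mean) ** 2 for x in data) / n)
    sds = [grand_sd / 2.0, grand_sd / 2.0]
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: posterior probability that each observation is in class 1.
        resp = []
        for x in data:
            p1 = weights[0] * normal_pdf(x, means[0], sds[0])
            p2 = weights[1] * normal_pdf(x, means[1], sds[1])
            resp.append(p1 / (p1 + p2 + 1e-300))
        # M-step: re-estimate weights, means and sds from the posteriors.
        r1 = sum(resp)
        r2 = n - r1
        weights = [r1 / n, r2 / n]
        means = [
            sum(r * x for r, x in zip(resp, data)) / r1,
            sum((1.0 - r) * x for r, x in zip(resp, data)) / r2,
        ]
        sds = [
            math.sqrt(sum(r * (x - means[0]) ** 2
                          for r, x in zip(resp, data)) / r1),
            math.sqrt(sum((1.0 - r) * (x - means[1]) ** 2
                          for r, x in zip(resp, data)) / r2),
        ]
    return weights, means, sds


def class_cutoff(weights, means, sds, tol=1e-6):
    """Bisect between the class means for the point of equal posterior probability."""
    lo, hi = means[0], means[1]
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        diff = (weights[0] * normal_pdf(mid, means[0], sds[0])
                - weights[1] * normal_pdf(mid, means[1], sds[1]))
        if diff > 0:
            lo = mid  # class 1 still more likely: the cutoff lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Simulated droplet rates (%). Class parameters are hypothetical, not from the paper.
rng = random.Random(0)
N_TOTAL = 1559                   # sample size stated in the abstract
n_class1 = round(0.8 * N_TOTAL)  # roughly 80% of observations in latent class 1
data = ([rng.gauss(4.0, 1.2) for _ in range(n_class1)]
        + [rng.gauss(14.0, 3.0) for _ in range(N_TOTAL - n_class1)])

weights, means, sds = fit_two_class_mixture(data)
cutoff = class_cutoff(weights, means, sds)

# Standard error of the cohort mean, ignoring vs. using the latent classes.
n = len(data)
grand_mean = sum(data) / n
pooled_se = math.sqrt(sum((x - grand_mean) ** 2 for x in data) / (n - 1) / n)
# Assumption: within-class variance weighted by the estimated class fractions.
mixture_se = math.sqrt((weights[0] * sds[0] ** 2 + weights[1] * sds[1] ** 2) / n)

print("class fractions:", [round(w, 3) for w in weights])
print("cutoff (%):", round(cutoff, 2))
print("pooled SE:", round(pooled_se, 4), "mixture-based SE:", round(mixture_se, 4))
```

The pooled standard error includes the between-class spread, while the mixture-based one uses only the within-class spread, which is why the latter comes out smaller when the data are clearly bimodal. In practice a statistical package's mixture-model routine would replace the hand-written EM loop, and the number of classes would be chosen by comparing model fit rather than fixed in advance.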
Source journal
Animal Reproduction Science (Agricultural and Forestry Sciences: Dairy & Animal Science)
CiteScore: 4.50
Self-citation rate: 9.10%
Articles published: 136
Review time: 54 days
About the journal: Animal Reproduction Science publishes results from studies relating to reproduction and fertility in animals. This includes both fundamental research and applied studies, including management practices that increase our understanding of the biology and manipulation of reproduction. Manuscripts should go into depth on the mechanisms involved in the research reported, rather than give a mere description of findings. The focus is on animals that are useful to humans, including food- and fibre-producing, companion/recreational, captive, and endangered species including zoo animals, but excluding laboratory animals unless the results of the study provide new information that impacts the basic understanding of the biology or manipulation of reproduction. The journal's scope includes the study of reproductive physiology and endocrinology, reproductive cycles, natural and artificial control of reproduction, preservation and use of gametes and embryos, pregnancy and parturition, infertility and sterility, and diagnostic and therapeutic techniques. The Editorial Board of Animal Reproduction Science has decided not to publish papers in which there is an exclusive examination of the in vitro development of oocytes and embryos; however, there will be consideration of papers that include in vitro studies where the source of the oocytes and/or development of the embryos beyond the blastocyst stage is part of the experimental design.