Ameer A Megahed, Reddy Bommineni, Michael Short, Klibs N Galvão, João H J Bittar
Preventive Veterinary Medicine, volume 235, article 106387. Published 2024-11-28. DOI: 10.1016/j.prevetmed.2024.106387 (https://doi.org/10.1016/j.prevetmed.2024.106387)
Using supervised machine learning algorithms to predict bovine leukemia virus seropositivity in dairy cattle in Florida: A 10-year retrospective study.
Supervised machine-learning (SML) algorithms are potentially powerful tools for screening cows for infectious diseases such as bovine leukemia virus (BLV) infection. Here, we compared six SML algorithms to identify the most important risk factors for predicting BLV seropositivity in dairy cattle in Florida. We used a dataset of 1279 dairy blood sample records from the Bronson Animal Disease Diagnostic Laboratory, submitted for BLV antibody testing from 2012 to 2022. The SML algorithms were logistic regression (LR), decision tree (DT), gradient boosting (GB), random forest (RF), neural network (NN), and support vector machine (SVM). A total of 312 serum samples were positive for BLV, giving a corrected seroprevalence of 26.0%. Subject to the limitations of the retrospective data analyzed, the RF model was the best model for predicting BLV seropositivity in dairy cattle, as indicated by the highest Kolmogorov-Smirnov (KS) statistic (0.75), area under the receiver operating characteristic curve (AUROC; 0.93), and gain (2.6), and by the lowest misclassification rate (0.10). The LR model performed worst. The RF model showed that the best predictors of BLV seropositivity were age (dairy cows aged ≥ 5 years) and geographic location (southern Florida). We concluded that RF and other SML algorithms hold promise for predicting BLV seropositivity in dairy cattle, and that dairy cattle 5 years of age or older raised in southern Florida are more likely to test positive for BLV. This study makes an important methodological contribution to the needed development of predictive tools for effective BLV screening and emphasizes the importance of collecting and using representative data in such predictive models.
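The four criteria used above to rank the models (KS statistic, AUROC, gain, and misclassification rate) can all be computed from a model's predicted probabilities and the observed serostatus. The following is a minimal sketch in plain Python, not the authors' implementation: the function names, the 0.5 classification threshold, the reading of "gain" as lift in the top-scoring fraction of animals, and the four-record toy dataset are all assumptions made for illustration.

```python
def auroc(labels, scores):
    """AUROC as the probability that a random positive outscores a random
    negative (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def ks_statistic(labels, scores):
    """Largest gap between the cumulative score distributions of
    seropositive and seronegative records."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = 0.0
    for t in sorted(set(scores)):
        cdf_pos = sum(1 for y, s in zip(labels, scores) if y == 1 and s <= t) / n_pos
        cdf_neg = sum(1 for y, s in zip(labels, scores) if y == 0 and s <= t) / n_neg
        best = max(best, abs(cdf_pos - cdf_neg))
    return best


def misclassification_rate(labels, scores, threshold=0.5):
    """Share of records whose thresholded prediction disagrees with the label.
    The 0.5 threshold is an assumption, not taken from the study."""
    wrong = sum(1 for y, s in zip(labels, scores) if (s >= threshold) != (y == 1))
    return wrong / len(labels)


def gain(labels, scores, fraction=0.1):
    """Lift: positive rate among the top-scoring fraction of records divided
    by the overall positive rate. One common definition, assumed here."""
    ranked = sorted(zip(scores, labels), reverse=True)
    k = max(1, int(len(ranked) * fraction))
    top_rate = sum(y for _, y in ranked[:k]) / k
    return top_rate / (sum(labels) / len(labels))


# Toy data for illustration only (1 = seropositive); not from the study.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]

print("AUROC:", auroc(toy_labels, toy_scores))
print("KS:", ks_statistic(toy_labels, toy_scores))
print("Misclassification:", misclassification_rate(toy_labels, toy_scores))
print("Gain (top 25%):", gain(toy_labels, toy_scores, fraction=0.25))
```

Under these definitions, higher KS, AUROC, and gain and a lower misclassification rate all indicate better separation of seropositive from seronegative records, which matches how the abstract ranks the RF model first and the LR model last.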
About the journal:
Preventive Veterinary Medicine is one of the leading international resources for scientific reports on animal health programs and preventive veterinary medicine. The journal follows the guidelines for standardizing and strengthening the reporting of biomedical research which are available from the CONSORT, MOOSE, PRISMA, REFLECT, STARD, and STROBE statements. The journal focuses on:
Epidemiology of health events relevant to domestic and wild animals;
Economic impacts of epidemic and endemic animal and zoonotic diseases;
Latest methods and approaches in veterinary epidemiology;
Disease and infection control or eradication measures;
The "One Health" concept and the relationships between veterinary medicine, human health, animal-production systems, and the environment;
Development of new techniques in surveillance systems and diagnosis;
Evaluation and control of diseases in animal populations.