Research on the prediction model of mastitis in dairy cows based on time series characteristics
Authors: Rui Guo, Yongqiang Dai, Junjie Hu
Journal: Frontiers in Veterinary Science, volume 12, article 1575525
Published: 2025-04-24
DOI: https://doi.org/10.3389/fvets.2025.1575525
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12058853/pdf/
Introduction: Mastitis in dairy cows is a major challenge for the global dairy industry: it reduces milk quality and yield and causes severe economic losses for dairy enterprises. With growing public concern over food safety and the rational use of antibiotics, identifying at-risk cows early has become an urgent problem. Subclinical mastitis is especially hard to detect because it lacks obvious external symptoms, which makes early warning particularly important.
Methods: This study combined time-series feature construction with machine learning to predict the risk of mastitis in dairy cows. The data came from the production records of 4,000 dairy cows on a large farm in the Hexi region of Gansu. For each cow, production indicators such as milk yield, fat percentage, and protein percentage from two consecutive months, April and May, were used to predict its health status in June. To make full use of the time-series information, we designed a multidimensional feature set comprising raw indicator values, monthly change rates, and statistical features. After data preprocessing and sample balancing, data from 2,821 cows were selected for model training. Six models were then compared on prediction performance to assess their applicability: eXtreme Gradient Boosting (XGBoost), Gradient Boosting Decision Tree (GBDT), Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Logistic Regression, and Long Short-Term Memory network (LSTM).
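To make the feature set concrete, here is a minimal Python sketch of how raw values, monthly change rates, and statistical features might be derived for one cow from its April and May records. The abstract does not give the exact formulas, so everything below is an assumption for illustration: the field names (`milk_yield`, `fat_pct`, `protein_pct`), the relative-change definition of the change rate, and the use of the mean and population standard deviation as the statistical features.

```python
from statistics import mean, pstdev

# Hypothetical indicator names; the abstract lists milk yield, fat and protein.
INDICATORS = ("milk_yield", "fat_pct", "protein_pct")


def build_features(april, may):
    """Build one cow's feature dict from its April and May records.

    april, may: dicts mapping each indicator name to a numeric value.
    The formulas are illustrative assumptions, not the authors' definitions.
    """
    features = {}
    for name in INDICATORS:
        a, m = april[name], may[name]
        # Raw indicator values for each month.
        features[f"{name}_apr"] = a
        features[f"{name}_may"] = m
        # Monthly change rate, assumed here to be the relative change.
        features[f"{name}_change_rate"] = (m - a) / a if a != 0 else 0.0
        # Statistical features over the two months (assumed: mean and
        # population standard deviation).
        features[f"{name}_mean"] = mean([a, m])
        features[f"{name}_std"] = pstdev([a, m])
    return features


# Made-up example values, for illustration only.
example = build_features(
    {"milk_yield": 30.0, "fat_pct": 4.0, "protein_pct": 3.2},
    {"milk_yield": 27.0, "fat_pct": 4.4, "protein_pct": 3.2},
)
print(sorted(example))
```

With only two monthly observations per cow, the standard deviation is simply half the absolute difference between the two months. Under these assumed definitions it therefore carries much the same information as the size of the monthly change.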
Results: The XGBoost model performed best among the six models, achieving an area under the ROC curve (AUC) of 0.75 and an accuracy of 71.36%. Feature importance analysis identified three key temporal indicators: May milk yield (22.29%), standard deviation of fat percentage (20.27%), and fat percentage change rate (19.87%). SHapley Additive exPlanations (SHAP) analysis further supported the predictive value of these temporal features, giving dairy farm managers clearly defined monitoring priorities.
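For readers unfamiliar with the two reported metrics, the following Python sketch shows how AUC and accuracy can be computed from true labels and predicted risk scores. It is a generic illustration, not the authors' evaluation code: the function names, the 0.5 decision threshold, and the toy data are all assumptions. AUC is computed as the fraction of (positive, negative) pairs in which the positive case gets the higher score, counting ties as one half.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    # Count pairwise wins; a tie between scores counts as half a win.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Share of cases classified correctly at an assumed threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Toy data (1 = mastitis, 0 = healthy); not taken from the study.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(toy_labels, toy_scores), accuracy(toy_labels, toy_scores))
```

AUC does not depend on a decision threshold, whereas accuracy does. That is one reason the two figures are usually reported together.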
Discussion: The XGBoost model demonstrates strong potential as an accurate predictive tool for subclinical mastitis in dairy cows. This study presents an effective early-warning approach through time-series modeling that offers significant practical value for mastitis prevention in dairy farm management.
About the journal:
Frontiers in Veterinary Science is a global, peer-reviewed, Open Access journal that bridges animal and human health, brings a comparative approach to medical and surgical challenges, and advances innovative biotechnology and therapy.
Veterinary research today is interdisciplinary, collaborative, and socially relevant, transforming how we understand and investigate animal health and disease. Fundamental research in emerging infectious diseases, predictive genomics, stem cell therapy, and translational modelling is grounded within the integrative social context of public and environmental health, wildlife conservation, novel biomarkers, societal well-being, and cutting-edge clinical practice and specialization. Frontiers in Veterinary Science brings a 21st-century approach—networked, collaborative, and Open Access—to communicate this progress and innovation to both the specialist and to the wider audience of readers in the field.
Frontiers in Veterinary Science publishes articles on outstanding discoveries across a wide spectrum of translational, foundational, and clinical research. The journal's mission is to bring all relevant veterinary sciences together on a single platform with the goal of improving animal and human health.