Title: Multimodal fusion-based detection method of estrus cows using multisource data inspired by hidden Markov model algorithms
Authors: Jun Wang, Yijia Zhao, Xiaoxia Li, Yu Zhou, Kaixuan Zhao, Hui Wang, Waleid Mohamed EL-Sayed Shakweer
Journal: Computers and Electronics in Agriculture, volume 235, Article 110391 (impact factor 8.9; JCR Q1, Agriculture, Multidisciplinary)
Publication date: 2025-04-12
DOI: 10.1016/j.compag.2025.110391
URL: https://www.sciencedirect.com/science/article/pii/S0168169925004971
Citations: 0
Abstract
The combination of multimodal features based on vocalization and behavioral traits has been shown to enhance the robustness and accuracy of estrus detection. However, several challenges still limit the reliability and practicality of identifying estrus cows: unclear association mechanisms between multi-feature estrus traits and complex estrus states, inadequate feature selection and fusion strategies, and limitations in algorithm performance. To address these difficulties, the Friedman test, Mantel test, Spearman rank correlation coefficient, Kruskal-Wallis test, and Canonical Correspondence Analysis (CCA) were employed to explore the complex interactions among high-dimensional estrus data. Moreover, a model interpretability framework based on the self-attention mechanism was constructed to reveal the importance of critical features and optimize feature combinations. In addition, a multimodal fusion approach integrating standardization, principal component analysis, Fourier transform, statistical indicator extraction, and the semi-tensor product was designed to deepen the representation and improve the fusion of multi-dimensional data by comprehensively analyzing the relationships among features. Furthermore, a neural network-optimized Hidden Markov Model (NN-HMM) for estrus detection was proposed to overcome the shortcomings of the traditional HMM in capturing long-distance dependencies in state transition matrices, describing complex feature relationships in emission probability matrices, and dynamically adapting optimal path generation.
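As background for the HMM terms used above (state transition matrix, emission probability matrix, optimal path), the following is a minimal sketch of classical Viterbi decoding for a plain discrete-emission HMM. It is not the paper's NN-HMM, whose neural components are not described in the abstract. The two states (0 = non-estrus, 1 = estrus), the two observation symbols (0 = low activity, 1 = high activity), and all probabilities are hypothetical values chosen only for illustration.

```python
import numpy as np

def viterbi(log_pi, log_A, log_B, obs):
    """Most likely hidden-state path for a discrete-emission HMM (log domain).

    log_pi: (S,)   initial state log-probabilities
    log_A:  (S, S) transition log-probabilities, log_A[i, j] = log P(j | i)
    log_B:  (S, O) emission log-probabilities,   log_B[s, o] = log P(o | s)
    obs:    sequence of observation indices in [0, O)
    """
    T, S = len(obs), log_pi.shape[0]
    delta = np.empty((T, S))            # best log-score of any path ending in each state
    back = np.zeros((T, S), dtype=int)  # best predecessor state for backtracking
    delta[0] = log_pi + log_B[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_A  # scores[i, j]: come from i, go to j
        back[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_B[:, obs[t]]
    # Backtrack from the best final state.
    path = np.empty(T, dtype=int)
    path[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):
        path[t] = back[t + 1, path[t + 1]]
    return path.tolist()

# Hypothetical toy parameters (not taken from the paper).
pi = np.array([0.9, 0.1])                # mostly start in non-estrus
A = np.array([[0.9, 0.1], [0.2, 0.8]])   # state transition matrix
B = np.array([[0.8, 0.2], [0.3, 0.7]])   # emission probability matrix
obs = [0, 0, 1, 1, 1]                    # low, low, high, high, high activity

path = viterbi(np.log(pi), np.log(A), np.log(B), obs)
print(path)
```

In a plain HMM, `A` and `B` are fixed tables, which is the limitation the abstract attributes to the traditional model; the paper's approach optimizes these components with neural networks.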
The experimental results showed that selecting the optimal combination of acoustic and behavioral features (number of bellowings, number of lowings, number of consecutive bellowings, vocalization frequency, standing duration, lying duration, walking duration, feeding duration, activity index, number of standing mounts, number of social behaviors, and ruminating variation index) improved estrus detection accuracy by over 18% compared with suboptimal feature combinations. Compared with Support Vector Machine (SVM), Convolutional Neural Network (CNN), and Long Short-Term Memory (LSTM) models, the accuracy of the developed multimodal fusion-based detection method increased by 8.8%, 7.4%, and 6.1%, and the precision by 7.4%, 6.0%, and 4.3%, respectively. Blind testing on multisource estrus data from 24 multiparous and 16 primiparous cows found that the proposed method advanced the prediction of estrus onset by up to 85 and 40 min for primiparous cows and up to 75 and 60 min for multiparous cows, respectively, significantly surpassing conventional activity index and acoustic-based methods. The proposed method therefore offers a reliable solution for timely and efficient detection of estrus on dairy farms.
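The semi-tensor product named in the fusion pipeline generalizes matrix multiplication to operands whose inner dimensions do not match, which is what allows feature blocks of different sizes to be combined. The sketch below uses the standard left semi-tensor product definition, A ⋉ B = (A ⊗ I_{t/n})(B ⊗ I_{t/p}) with t = lcm(n, p), where A has n columns and B has p rows. How the paper applies it to acoustic and behavioral features is not stated in the abstract, so the function name and the example vectors are illustrative assumptions.

```python
import math
import numpy as np

def semi_tensor_product(A, B):
    """Left semi-tensor product of A (m x n) and B (p x q).

    Reduces to the ordinary matrix product A @ B when n == p.
    """
    A = np.atleast_2d(np.asarray(A, dtype=float))
    B = np.atleast_2d(np.asarray(B, dtype=float))
    n, p = A.shape[1], B.shape[0]
    t = n * p // math.gcd(n, p)  # least common multiple of n and p
    # Pad each operand with an identity via the Kronecker product so
    # that the inner dimensions agree (both become t).
    return np.kron(A, np.eye(t // n)) @ np.kron(B, np.eye(t // p))

# Hypothetical feature blocks of mismatched size (values are made up).
acoustic = [[1.0, 2.0]]                    # 1 x 2 row vector
behavior = [[1.0], [2.0], [3.0], [4.0]]    # 4 x 1 column vector

fused = semi_tensor_product(acoustic, behavior)
print(fused.shape)
print(fused)
```

Here the ordinary product `acoustic @ behavior` is undefined because 2 does not equal 4, while the semi-tensor product yields a 2 x 1 result.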
About the journal:
Computers and Electronics in Agriculture provides international coverage of advancements in computer hardware, software, electronic instrumentation, and control systems applied to agricultural challenges. Encompassing agronomy, horticulture, forestry, aquaculture, and animal farming, the journal publishes original papers, reviews, and applications notes. It explores the use of computers and electronics in plant or animal agricultural production, covering topics like agricultural soils, water, pests, controlled environments, and waste. The scope extends to on-farm post-harvest operations and relevant technologies, including artificial intelligence, sensors, machine vision, robotics, networking, and simulation modeling. Its companion journal, Smart Agricultural Technology, continues the focus on smart applications in production agriculture.