Automatic body temperature detection of group-housed piglets based on infrared and visible image fusion
Kaixuan Cuan, Feiyue Hu, Xiaoshuai Wang, Xiaojie Yan, Yanchao Wang, Kaiying Wang
Artificial Intelligence in Agriculture 16(1): 1–11, 2025-07-07. DOI: 10.1016/j.aiia.2025.06.008

Abstract: Rapid and accurate measurement of body temperature is essential for early disease detection, as it is a key indicator of piglet health. Infrared thermography (IRT) is a widely used, convenient, non-intrusive, and efficient non-contact temperature measurement technology. However, the activity and clustering of group-housed piglets make it challenging to measure individual body temperatures using IRT. This study proposes a method for detecting the body temperature of group-housed piglets using infrared-visible image fusion. Infrared and visible images were automatically captured by cameras mounted on a robot. An improved YOLOv8-PT model was proposed to detect both piglets and their key body regions (ears, abdomen, and hip) in visible images. Subsequently, the Oriented FAST and Rotated BRIEF (ORB) image registration method and the U2Fusion image fusion network were employed to extract temperatures from the detected body parts. Finally, a core body temperature (CBT) estimation model was developed, with actual rectal temperature serving as the gold standard. The temperatures of the three body parts detected by infrared thermography were used to estimate CBT, and the maximum estimated temperature across these body parts (EBT-Max) was selected as the final result. In the experiments, the YOLOv8-PT model achieved a mAP@0.5 of 93.6 %, precision of 93.3 %, recall of 88.9 %, and F1 score of 91.05 %. The average detection time per image was 4.3 ms, enabling real-time detection. Additionally, the mean absolute error (MAE) and correlation coefficient between EBT-Max and actual rectal temperature are 0.40 °C and 0.6939, respectively. This method therefore provides a feasible and efficient approach to body temperature detection for group-housed piglets and offers a reference for the development of automated pig health monitoring systems.
VMGP: A unified variational auto-encoder based multi-task model for multi-phenotype, multi-environment, and cross-population genomic selection in plants
Xiangyu Zhao, Fuzhen Sun, Jinlong Li, Dongfeng Zhang, Qiusi Zhang, Zhongqiang Liu, Changwei Tan, Hongxiang Ma, Kaiyi Wang
Artificial Intelligence in Agriculture 15(4): 829–842, 2025-06-24. DOI: 10.1016/j.aiia.2025.06.007

Abstract: Plant breeding is a cornerstone of agricultural productivity and food security. Genomic selection marks a new era in breeding: it harnesses whole-genome variation for genomic prediction without requiring prior knowledge of the genes associated with specific traits. Nonetheless, the vast dimensionality of genomic data, juxtaposed with the relatively limited number of phenotyped samples, often leads to the "curse of dimensionality", where traditional statistical, machine learning, and deep learning methods are prone to overfitting and suboptimal predictive performance. To surmount this challenge, we introduce a unified Variational auto-encoder based Multi-task Genomic Prediction model (VMGP) that integrates self-supervised genomic compression and reconstruction with multiple prediction tasks. The approach offers a robust predictive framework that has been rigorously validated on public datasets for wheat, rice, and maize. Our model demonstrates exceptional capabilities in multi-phenotype and multi-environment genomic prediction, successfully navigating the complexities of cross-population genomic selection. Furthermore, by combining VMGP with model interpretability, we can effectively prioritize relevant single nucleotide polymorphisms, thereby enhancing prediction performance and suggesting cost-effective genotyping solutions. With its simplicity, stable predictive performance, and open-source code, the VMGP framework is well-suited for broad adoption in plant breeding programs. It is particularly advantageous for breeders who prioritize phenotype prediction but may not possess extensive knowledge of deep learning or proficiency in parameter tuning.
{"title":"Recognizing and localizing chicken behaviors in videos based on spatiotemporal feature learning","authors":"Yilei Hu , Jinyang Xu , Zhichao Gou , Di Cui","doi":"10.1016/j.aiia.2025.06.006","DOIUrl":"10.1016/j.aiia.2025.06.006","url":null,"abstract":"<div><div>Timely acquisition of chicken behavioral information is crucial for assessing chicken health status and production performance. Video-based behavior recognition has emerged as a primary technique for obtaining such information due to its accuracy and robustness. Video-based models generally predict a single behavior from a single video segment of a fixed duration. However, during periods of high activity in poultry, behavior transition may occur within a video segment, and existing models often fail to capture such transitions effectively. This limitation highlights the insufficient temporal resolution of video-based behavior recognition models. This study presents a chicken behavior recognition and localization model, CBLFormer, which is based on spatiotemporal feature learning. The model was designed to recognize behaviors that occur before and after transitions in video segments and to localize the corresponding time interval for each behavior. An improved transformer block, the cascade encoder-decoder network (CEDNet), a transformer-based head, and weighted distance intersection over union (WDIoU) loss were integrated into CBLFormer to enhance the model's ability to distinguish between different behavior categories and locate behavior boundaries. For the training and testing of CBLFormer, a dataset was created by collecting videos from 320 chickens across different ages and rearing densities. The results showed that CBLFormer achieved a [email protected]:0.95 of 98.34 % on the test set. The integration of CEDNet contributed the most to the performance improvement of CBLFormer. The visualization results confirmed that the model effectively captured the behavioral boundaries of chickens and correctly recognized behavior categories. The transfer learning results demonstrated that the model is applicable to chicken behavior recognition and localization tasks in real-world poultry farms. The proposed method handles cases where poultry behavior transitions occur within the video segment and improves the temporal resolution of video-based behavior recognition models.</div></div>","PeriodicalId":52814,"journal":{"name":"Artificial Intelligence in Agriculture","volume":"15 4","pages":"Pages 816-828"},"PeriodicalIF":8.2,"publicationDate":"2025-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144480996","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
FGPointKAN++ point cloud segmentation and adaptive key cutting plane recognition for cow body size measurement
Guoyuan Zhou, Wenhao Ye, Sheng Li, Jian Zhao, Zhiwen Wang, Guoliang Li, Jiawei Li
Artificial Intelligence in Agriculture 15(4): 783–801, 2025-06-18. DOI: 10.1016/j.aiia.2025.06.003

Abstract: Accurate and efficient body size measurement is essential for health assessment and production management in modern animal husbandry. To achieve pixel-level segmentation of point clouds and accurate body size calculation for dairy cows in different postures, a segmentation model (FGPointKAN++) and an adaptive key cutting plane recognition (AKCPR) model are developed. FGPointKAN++ introduces an FGE module and KAN to enhance local feature extraction and geometric consistency, significantly improving dairy cow part segmentation accuracy. AKCPR utilizes adaptive plane fitting and dynamic orientation calibration to optimize the key body size measurements, and the dairy cow body size parameters are then calculated from the plane geometry features. The experimental results show mIoU scores of 82.92 % and 83.24 % for pixel-level point cloud segmentation of dairy cows. The mean absolute percentage errors (MAPE) of wither height (WH), body width (BW), chest circumference (CC), and abdominal circumference (AC) are 2.07 %, 3.56 %, 2.24 %, and 1.42 %, respectively. This method enables precise segmentation and automatic body size measurement of dairy cows in various walking postures, showing considerable potential for practical applications. It provides technical support for unmanned, intelligent, and precision farming, thereby enhancing animal welfare and improving economic efficiency.
EU-GAN: A root inpainting network for improving 2D soil-cultivated root phenotyping
Shangyuan Xie, Jiawei Shi, Wen Li, Tao Luo, Weikun Li, Lingfeng Duan, Peng Song, Xiyan Yang, Baoqi Li, Wanneng Yang
Artificial Intelligence in Agriculture 15(4): 770–782, 2025-06-11. DOI: 10.1016/j.aiia.2025.06.004

Abstract: Beyond its fundamental roles in nutrient uptake and plant anchorage, the root system critically influences crop development and stress tolerance. Rhizoboxes enable in situ, nondestructive phenotypic detection of roots in soil and serve as a cost-effective root imaging method. However, the opacity of the soil often leaves intermittent gaps in root images, which reduces the accuracy of root phenotype calculations. We present a root inpainting method built upon a Generative Adversarial Network (GAN) architecture. In addition, we built a hybrid root inpainting dataset (HRID) that contains 1206 cotton root images with real gaps and 7716 rice root images with generated gaps; compared with computer-simulated root images, our dataset provides real root system architecture (RSA) and root texture information. Our method avoids cropping during training by instead utilizing downsampled images to provide the overall root morphology. The model is trained using a binary cross-entropy loss to distinguish between root and non-root pixels, and a Dice loss is employed to mitigate the challenge of imbalanced data distribution. Additionally, we remove the skip connections in U-Net and introduce an edge attention module (EAM) to capture more detailed information. Compared with other methods, our approach significantly improves the recall rate from 17.35 % to 35.75 % on the test dataset of 122 cotton root images, revealing improved inpainting capability. The trait error reduction rates (TERRs) for root area, root length, convex hull area, and root depth are 76.07 %, 68.63 %, 48.64 %, and 88.28 %, respectively, enabling a substantial improvement in the accuracy of root phenotyping. The EU-GAN code and the 8922 labeled images are open access and can be reused by researchers in other AI-related work. This method establishes a robust solution for root phenotyping, thereby increasing breeding program efficiency and advancing our understanding of root system dynamics.
Improving accuracy and generalization in single kernel oil characteristics prediction in maize using NIR-HSI and a knowledge-injected spectral tabtransformer
Anran Song, Xinyu Guo, Weiliang Wen, Chuanyu Wang, Shenghao Gu, Xiaoqian Chen, Juan Wang, Chunjiang Zhao
Artificial Intelligence in Agriculture 15(4): 802–815, 2025-06-11. DOI: 10.1016/j.aiia.2025.05.007

Abstract: Near-infrared hyperspectral imaging (NIR-HSI) is widely used for seed component prediction due to its non-destructive and rapid nature. However, existing models often suffer from limited generalization, particularly when trained on small datasets, and effective deep learning (DL) models for spectral data analysis are lacking. To address these challenges, we propose the Knowledge-Injected Spectral TabTransformer (KIT-Spectral TabTransformer), an innovative adaptation of the traditional TabTransformer specifically designed for maize seeds. By integrating domain-specific knowledge, this approach enhances model training efficiency and predictive accuracy while reducing reliance on large datasets. The generalization capability of the model was rigorously validated through ten-fold cross-validation (10-CV). Compared with traditional machine learning methods, an attention-based CNN (ACNNR), and the Oil Characteristics Predictor Transformer (OCP-Transformer), the KIT-Spectral TabTransformer demonstrated superior performance: for oil mass prediction it achieved R_p^2 = 0.9238 ± 0.0346 and RMSE_p = 0.1746 ± 0.0401, and for oil content, R_p^2 = 0.9602 ± 0.0180 and RMSE_p = 0.5301 ± 0.1446 on a dataset with oil content ranging from 0.81 % to 13.07 %. On the independent validation set, our model achieved R^2 values of 0.7820 and 0.7586, along with RPD values of 2.1420 and 2.0355 on the two tasks, highlighting its strong predictive capability and potential for real-world application. These findings offer a method and direction for single-seed oil prediction and related crop component analysis.
Rapid detection and visualization of physiological signatures in cotton leaves under Verticillium wilt stress
Na Wu, Pan Gao, Jie Wu, Yun Zhao, Xing Xu, Chu Zhang, Erik Alexandersson, Juan Yang, Qinlin Xiao, Yong He
Artificial Intelligence in Agriculture 15(4): 757–769, 2025-06-06. DOI: 10.1016/j.aiia.2025.06.002

Abstract: Verticillium wilt poses a severe threat to cotton growth and significantly impacts cotton yield, making timely detection of Verticillium wilt stress essential. In this study, the effects of Verticillium wilt stress on the microstructure and physiological indicators (SOD, POD, CAT, MDA, Chl-a, Chl-b, Chl-ab, Car) of cotton leaves were investigated, and the feasibility of utilizing hyperspectral imaging to estimate physiological indicators of cotton leaves was explored. The results showed that Verticillium wilt stress induced alterations in cotton leaf cell morphology, leading to the disruption and decomposition of chloroplasts and mitochondria. In addition, compared with healthy leaves, infected leaves exhibited significantly higher activities of SOD and POD, along with increased MDA content, while chlorophyll and carotenoid levels were notably reduced. Furthermore, rapid detection models for the cotton physiological indicators were constructed, with the R_p of the optimal models ranging from 0.809 to 0.975. Based on these models, visual distribution maps of the physiological signatures across cotton leaves were created. These results indicate that the physiological phenotype of cotton leaves can be effectively detected by hyperspectral imaging, providing a solid theoretical basis for the rapid detection of Verticillium wilt stress.
Multi-camera fusion and bird-eye view location mapping for deep learning-based cattle behavior monitoring
Muhammad Fahad Nasir, Alvaro Fuentes, Shujie Han, Jiaqi Liu, Yongchae Jeong, Sook Yoon, Dong Sun Park
Artificial Intelligence in Agriculture 15(4): 724–743, 2025-06-06. DOI: 10.1016/j.aiia.2025.06.001

Abstract: Cattle behavioral monitoring is an integral component of the modern livestock industry. Ensuring cattle well-being requires precise observation, typically using wearable devices or surveillance cameras, and integrating deep learning into these systems enhances the monitoring of cattle behavior. However, challenges remain, such as occlusions, pose variations, and limited camera viewpoints, which hinder accurate detection and location mapping of individual cattle. To address these challenges, this paper proposes a multi-viewpoint surveillance system for indoor cattle barns, using footage from four cameras and deep learning models, including action detection and pose estimation, for behavior monitoring. The system accurately detects hierarchical behaviors across camera viewpoints, and the results are fed into a Bird's Eye View (BEV) algorithm that produces precise maps of cattle positions in the barn. Despite complexities such as overlapping and non-overlapping camera regions, our system, implemented on a real farm, ensures accurate cattle detection and BEV-based projections in real time. Detailed experiments validate the system's efficiency, offering an end-to-end methodology for accurate behavior detection and location mapping of individual cattle using multi-camera data.
A review on enhancing agricultural intelligence with large language models
Hongda Li, Huarui Wu, Qingxue Li, Chunjiang Zhao
Artificial Intelligence in Agriculture 15(4): 671–685, 2025-06-04. DOI: 10.1016/j.aiia.2025.05.006

Abstract: This paper systematically explores the application potential of large language models (LLMs) in the field of agricultural intelligence, focusing on key technologies and practical pathways. The study examines the adaptation of LLMs to agricultural knowledge, starting with foundational concepts such as architecture design, pre-training strategies, and fine-tuning techniques, to build a technical framework for knowledge integration in the agricultural domain. Using tools such as vector databases and knowledge graphs, the study enables the structured development of professional agricultural knowledge bases, and by combining multimodal learning with intelligent question-answering (Q&A) system design, it validates the application value of LLMs in agricultural knowledge services. Addressing the core challenges of domain adaptation (knowledge acquisition and integration, logical reasoning, multimodal data processing, agent collaboration, and dynamic knowledge updating), the paper proposes targeted solutions. The study further explores innovative applications of LLMs in scenarios such as precision crop management and market dynamics analysis, providing theoretical support and technical pathways for the development of agricultural intelligence. Through the technological innovation of LLMs and their deep integration with the agricultural sector, the intelligence of agricultural production, decision-making, and services can be effectively enhanced.
MSNet: A multispectral-image driven rapeseed canopy instance segmentation network
Yuang Yang, Xiaole Wang, Fugui Zhang, Zhenchao Wu, Yu Wang, Yujie Liu, Xuan Lv, Bowen Luo, Liqing Chen, Yang Yang
Artificial Intelligence in Agriculture 15(4): 642–658, 2025-05-31. DOI: 10.1016/j.aiia.2025.05.008

Abstract: Rapeseed canopy area and its growth are crucial phenotypic indicators of growth status, and accurate identification of rapeseed targets and their growth regions provides significant data support for phenotypic analysis and breeding research. However, in natural field environments, rapeseed detection remains a substantial challenge due to the limited feature representation capability of RGB-only modalities. To address this challenge, this study proposes MSNet, a dual-modal instance segmentation network based on YOLOv11n-seg that integrates RGB and near-infrared (NIR) modalities. The main improvements include three fusion location strategies (frontend fusion, mid-stage fusion, and backend fusion) and a newly introduced Hierarchical Attention Fusion Block (HAFB) for multimodal feature fusion. Comparative experiments on fusion locations indicate that the mid-stage fusion strategy achieves the best balance between detection accuracy and parameter efficiency, improving mAP50:95 over the baseline network by up to 3.5 %. With the HAFB module, the MSNet-H-HAFB model improves mAP50:95 by 6.5 % relative to the baseline network, with less than a 38 % increase in parameter count. Notably, mid-stage fusion consistently delivered the best detection performance in all experiments, providing clear design guidance for selecting fusion locations in future multimodal networks. In addition, comparisons with various RGB-only instance segmentation models show that all the proposed MSNet-HAFB fusion models significantly outperform single-modal models in rapeseed count detection tasks, confirming the potential advantages of multispectral fusion strategies in agricultural target recognition. Finally, MSNet was applied in an agricultural case study covering vegetation index analysis and frost damage classification: ZN6-2836 and ZS11 were predicted to be potentially superior varieties, and the EVI2 vegetation index achieved the best performance in rapeseed frost damage classification.