Wenlong Yi, Shuokang Xia, Sergey Kuzmin, Igor Gerasimov, Xiangping Cheng
Computers and Electronics in Agriculture, Volume 236, Article 110401. Published 2025-04-25. DOI: 10.1016/j.compag.2025.110401. URL: https://www.sciencedirect.com/science/article/pii/S0168169925005071
RTFVE-YOLOv9: Real-time fruit volume estimation model integrating YOLOv9 and binocular stereo vision
This study proposes RTFVE-YOLOv9, a real-time fruit volume estimation model that combines YOLOv9 with binocular stereo vision, to address the low automation and insufficient accuracy of fruit volume measurement in complex orchard environments, particularly under diverse canopy structures and severe branch-leaf occlusion. A Dual-Scale and Global–Local Sequence (DSGLSeq) module improves recognition of occluded fruits, and a Multi-Head and Multi-Scale Self-Interaction (MHMSI) module improves detection of small fruit targets. In validation experiments on major economic fruit tree varieties, including apples, pears, pomelos, and kiwifruit, RTFVE-YOLOv9 improved mean Average Precision (mAP) over the baseline YOLOv9-c model by 2.1%, 1.6%, 4%, and 3.8%, respectively, on the four fruit datasets. Ablation experiments, heatmap analysis, and Effective Receptive Field (ERF) analysis were used to examine the model's internal working mechanisms and to provide a foundation for subsequent optimization. The findings contribute to the application of computer vision in smart agriculture and provide technical support for precise orchard management.
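The abstract does not state how a detection and a stereo measurement are turned into a volume, so the following is only a generic sketch of one common approach, not the authors' method. It assumes a rectified pinhole stereo pair (depth Z = f·B/d) and approximates the fruit as an ellipsoid whose unseen depth axis equals its visible width. All function names, parameters, and numeric values are hypothetical.

```python
import math


def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Standard rectified-stereo relation Z = f * B / d (result in metres)."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def pixel_extent_to_metres(extent_px: float, depth_m: float, focal_px: float) -> float:
    """Back-project an image-plane extent to a metric size at the given depth."""
    return extent_px * depth_m / focal_px


def ellipsoid_volume_from_box(box_w_px: float, box_h_px: float, disparity_px: float,
                              focal_px: float, baseline_m: float) -> float:
    """Estimate volume (cubic metres) from a detector bounding box and its disparity.

    Assumption (not from the paper): the fruit is an ellipsoid whose third,
    unobserved axis has the same length as the visible width.
    """
    depth_m = depth_from_disparity(focal_px, baseline_m, disparity_px)
    width_m = pixel_extent_to_metres(box_w_px, depth_m, focal_px)
    height_m = pixel_extent_to_metres(box_h_px, depth_m, focal_px)
    # Ellipsoid volume 4/3 * pi * a * b * c with semi-axes width/2, height/2, width/2.
    return (math.pi / 6.0) * width_m * width_m * height_m


if __name__ == "__main__":
    # Hypothetical camera: 1000 px focal length, 10 cm baseline.
    volume_m3 = ellipsoid_volume_from_box(80, 80, 100, focal_px=1000, baseline_m=0.1)
    print(f"estimated volume: {volume_m3 * 1e6:.1f} cm^3")
```

In practice the bounding box would come from the detector and the disparity from a stereo matcher; an occluded fruit yields a truncated box, which is one reason occlusion handling matters for volume accuracy.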
About the journal:
Computers and Electronics in Agriculture provides international coverage of advancements in computer hardware, software, electronic instrumentation, and control systems applied to agricultural challenges. Encompassing agronomy, horticulture, forestry, aquaculture, and animal farming, the journal publishes original papers, reviews, and applications notes. It explores the use of computers and electronics in plant or animal agricultural production, covering topics like agricultural soils, water, pests, controlled environments, and waste. The scope extends to on-farm post-harvest operations and relevant technologies, including artificial intelligence, sensors, machine vision, robotics, networking, and simulation modeling. Its companion journal, Smart Agricultural Technology, continues the focus on smart applications in production agriculture.