TomPhenoNet: A multi-modal fusion and multi-task learning network model for monitoring growth parameters of dwarf tomatoes
Authors: Xunyi Ma, Yanxu Wu, Zhixian Lin, Tao Lin
Journal: Computers and Electronics in Agriculture, Volume 235, Article 110387
Publication date: 2025-04-19
DOI: 10.1016/j.compag.2025.110387
URL: https://www.sciencedirect.com/science/article/pii/S0168169925004934
Impact factor: 7.7 (JCR Q1, Agriculture, Multidisciplinary)
Citation count: 0
Abstract
Dwarf tomatoes have high edible and ornamental value, so multiple growth parameters must be monitored to balance yield and aesthetics. Deep learning has been widely applied to phenotype monitoring, but most studies focus on individual growth parameters and overlook the intrinsic relationships among them. To monitor multiple growth parameters simultaneously across the entire growth stage and across cultivars, this study develops TomPhenoNet, a multi-modal, multi-task phenotype monitoring network for dwarf tomatoes. The network uses top-view RGB-D images to estimate four key growth parameters: plant height, leaf area, fresh weight, and the number of red fruits. From the RGB images, TomPhenoNet generates mask images, fruit detection features, and the number of detected fruits. It then fuses the RGB-D images, mask images, and fruit detection features and applies a cross-stitch network to predict plant height, leaf area, and fresh weight. These predicted values are used to generate a dynamic occlusion coefficient, which adjusts the number of detected fruits to predict the number of red fruits. TomPhenoNet achieves R² values of 0.828, 0.930, 0.945, and 0.881 for plant height, leaf area, fresh weight, and the number of red fruits, respectively. Ablation experiments show that the cross-stitch network and the fruit detection features improve the prediction of growth parameters, and that TomPhenoNet performs best when both modules are combined. Feature importance analysis indicates that the network captures plant growth characteristics and corrects for the effect of leaf occlusion in the top view. This study promotes accurate tomato monitoring and provides data support for optimizing cultivation strategies.
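The abstract names two ingredients that a short sketch can make concrete: the cross-stitch mechanism used to share information between the regression tasks, and the R² score used to report results. The Python below is a minimal illustration under stated assumptions, not the authors' implementation. It shows a generic cross-stitch unit, in which each task's output features are a weighted sum of all tasks' input features, and the standard coefficient of determination. The function names `cross_stitch` and `r_squared`, the mixing weights `alpha`, and all numeric values are hypothetical. In a real network `alpha` would be learned during training.

```python
from typing import List, Sequence


def cross_stitch(features: Sequence[Sequence[float]],
                 alpha: Sequence[Sequence[float]]) -> List[List[float]]:
    """Generic cross-stitch unit for K tasks.

    features[j] is the feature vector of task j (all the same length).
    alpha is a K x K mixing matrix; output i is sum_j alpha[i][j] * features[j].
    """
    n_tasks = len(features)
    length = len(features[0])
    mixed = []
    for i in range(n_tasks):
        row = [sum(alpha[i][j] * features[j][k] for j in range(n_tasks))
               for k in range(length)]
        mixed.append(row)
    return mixed


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Illustrative values only: three task branches (e.g. height, leaf area,
# fresh weight), each holding a two-element feature vector.
task_features = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

# Hypothetical mixing weights: task 0 borrows half of its signal from task 1,
# while tasks 1 and 2 keep their own features unchanged.
alpha = [[0.5, 0.5, 0.0],
         [0.0, 1.0, 0.0],
         [0.0, 0.0, 1.0]]

mixed = cross_stitch(task_features, alpha)
print(mixed[0])  # [2.0, 3.0]
```

With an identity `alpha` the tasks stay fully separate, and off-diagonal weights let one task draw on another's features. This is the general idea behind sharing information among related growth parameters.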
About the journal:
Computers and Electronics in Agriculture provides international coverage of advances in computer hardware, software, electronic instrumentation, and control systems applied to agricultural challenges. Encompassing agronomy, horticulture, forestry, aquaculture, and animal farming, the journal publishes original papers, reviews, and application notes. It explores the use of computers and electronics in plant and animal agricultural production, covering topics such as agricultural soils, water, pests, controlled environments, and waste. The scope extends to on-farm post-harvest operations and relevant technologies, including artificial intelligence, sensors, machine vision, robotics, networking, and simulation modeling. Its companion journal, Smart Agricultural Technology, continues the focus on smart applications in production agriculture.