Instance segmentation of oyster mushroom datasets: A novel data sampling methodology for training and evaluation of deep learning models

Christos Charisis, Meiqing Wang, Dimitrios Argyropoulos

Smart Agricultural Technology, Volume 12, Article 101146
Published: 2025-07-02 · DOI: 10.1016/j.atech.2025.101146
URL: https://www.sciencedirect.com/science/article/pii/S2772375525003788
Journal Impact Factor: 5.7 · JCR: Q1 (Agricultural Engineering)
Citations: 0
Abstract
This paper proposes a novel data sampling methodology for training and evaluating deep-learning instance segmentation models, using a comprehensive image dataset of oyster mushroom clusters obtained from commercial farms and comprising 25,978 individual mushrooms. A custom data splitting and reduction strategy was designed to generate multiple training subsets for an in-depth evaluation of model performance. The study also examines the ability of five feature extraction backbone configurations of Mask R-CNN, i) CNN-based (ResNet50, ResNeXt101 and ConvNeXt) and ii) Transformer-based (Swin-Small and Swin-Tiny), to accurately detect and segment single mushroom instances within the clusters in the images. To complement the standard evaluation metrics (mAP, mAR), two new metrics, Correctness and the Instance Segmentation Quality Index (ISQI), were introduced: Correctness assesses segmentation quality, and ISQI combines information from both detection (mAR) and segmentation (Correctness). The new metrics examined the consistency of the generated masks across multiple experimental runs on distinct dataset splits, reflecting the ability of the models to produce similar masks despite variations in their training data. The results revealed that ConvNeXt consistently outperformed its counterparts (mAP = 0.7675, mAR = 0.8071; Correctness = 0.9160, ISQI = 0.8598) across all dataset sizes, demonstrating superior detection ability even under high occlusion and low illumination. Swin also exhibited high detection performance (mAP = 0.7616, mAR = 0.7991; Correctness = 0.9126, ISQI = 0.8540), though with a greater dependence on dataset size. Overall, this research highlights the importance of properly evaluating backbone architectures across different dataset sizes for developing robust DL instance segmentation models applicable to mushroom farming and other visually complex environments.
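The abstract describes a custom splitting and reduction strategy that holds out a fixed evaluation split and derives multiple training subsets of varying size. The paper's exact procedure is not reproduced here; as an illustrative sketch (the function name, reduction factors, and seeding are all hypothetical), one common way to build such nested subsets so that smaller subsets are contained in larger ones is:

```python
import random

def make_training_subsets(image_ids, test_fraction=0.2,
                          reduction_factors=(1.0, 0.75, 0.5, 0.25), seed=42):
    """Hold out a fixed test split, then derive nested training subsets of
    decreasing size; every smaller subset is contained in the larger ones,
    so differences between runs reflect dataset size, not sample identity.
    Illustrative sketch only -- not the paper's exact procedure."""
    rng = random.Random(seed)          # fixed seed keeps splits reproducible
    ids = list(image_ids)
    rng.shuffle(ids)
    n_test = int(len(ids) * test_fraction)
    test_ids, train_pool = ids[:n_test], ids[n_test:]
    subsets = {f: train_pool[:int(len(train_pool) * f)]
               for f in reduction_factors}
    return test_ids, subsets

test_ids, subsets = make_training_subsets(range(1000))
```

Nesting the subsets (rather than resampling each size independently) isolates the effect of dataset size on backbone performance, which matches the abstract's goal of comparing backbones "across different dataset sizes".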
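The abstract states that ISQI combines detection (mAR) and segmentation (Correctness) but does not give its formula. The reported values for both backbones are consistent with ISQI being the geometric mean of the two quantities; the sketch below makes that assumption explicit and checks it against the published numbers:

```python
import math

def isqi(mar, correctness):
    """Instance Segmentation Quality Index as the geometric mean of
    detection (mAR) and segmentation quality (Correctness).
    NOTE: this form is inferred from the reported values; the abstract
    does not state the formula."""
    return math.sqrt(mar * correctness)

# Reported ConvNeXt values: mAR = 0.8071, Correctness = 0.9160
print(round(isqi(0.8071, 0.9160), 4))  # 0.8598, matching the reported ISQI
# Reported Swin values: mAR = 0.7991, Correctness = 0.9126
print(round(isqi(0.7991, 0.9126), 4))  # 0.854, matching the reported 0.8540
```

A geometric mean penalizes imbalance between the two components, so a model cannot score well on ISQI by excelling at detection while producing poor masks (or vice versa), which fits the metric's stated purpose.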