MixSegNext: A CNN-Transformer hybrid model for semantic segmentation and picking point localization algorithm of Sichuan pepper in natural environments
Pengjun Xiang , Fei Pan , Tao Liu , Xiaoyu Zhao , Mengdie Hu , Dawei He , Boda Zhang
Computers and Electronics in Agriculture, Volume 237, Article 110564. Published 2025-05-24.
DOI: 10.1016/j.compag.2025.110564
https://www.sciencedirect.com/science/article/pii/S0168169925006702
Abstract
Precise identification of picking points is a prerequisite for the robotic harvesting of Sichuan pepper. Picking robots typically operate in open, dynamic natural environments, so the picking point localization algorithm must be robust. The growth environment of Sichuan pepper is complex, and the growth posture varies. The branches of the pepper clusters resemble the other pepper branches, which easily leads to misjudgment and omission during localization and makes accurate visual picking point localization challenging. To locate target Sichuan pepper picking points rapidly and accurately in natural environments, this paper proposes a Sichuan pepper segmentation model and a picking point localization algorithm based on MixSegNext. The algorithm has three main parts. First, the MixSegNext network performs semantic segmentation of Sichuan pepper clusters and fruits to extract the picking targets. Then, the extracted pepper fruit mask is subtracted from the pepper cluster mask to obtain the Sichuan pepper branch mask, and the main pepper branch mask is obtained through morphological operations and maximal connectivity analysis. Finally, edge extraction is performed on the main pepper branch mask, and the picking point is determined as the intersection between the central line of the contour and the edge. This paper compares MixSegNext with typical semantic segmentation networks and conducts picking point localization experiments. The results show that the network achieves higher segmentation precision than the compared networks and high picking point localization accuracy. Furthermore, the network is deployed on embedded devices to run segmentation inference on Sichuan pepper images, which verifies the practical effect of the algorithm and provides a reference for the visual positioning systems of Sichuan pepper-picking robots.
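The post-segmentation steps described in the abstract (mask subtraction, maximal connectivity analysis, and picking point selection) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The function names are hypothetical, the morphological operations are omitted, and the picking rule is an assumption: the abstract does not define the "central line", so the sketch takes it to be the image column through the centroid of the main branch mask and returns the topmost mask pixel on that column, which is a boundary pixel by construction.

```python
from collections import deque


def subtract_masks(cluster, fruit):
    """Branch mask: pixels in the cluster mask that are not fruit pixels."""
    return [[1 if c and not f else 0 for c, f in zip(crow, frow)]
            for crow, frow in zip(cluster, fruit)]


def largest_component(mask):
    """Keep only the largest 4-connected region of a binary mask."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    best = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            comp, queue = [], deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                comp.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(comp) > len(best):
                best = comp
    out = [[0] * w for _ in range(h)]
    for y, x in best:
        out[y][x] = 1
    return out


def picking_point(branch):
    """Assumed rule: topmost branch pixel on the centroid column.

    Returns (row, col), or None if the mask is empty or the centroid
    column contains no branch pixel.
    """
    pixels = [(y, x) for y, row in enumerate(branch)
              for x, v in enumerate(row) if v]
    if not pixels:
        return None
    cx = round(sum(x for _, x in pixels) / len(pixels))
    column = [y for y, x in pixels if x == cx]
    return (min(column), cx) if column else None


# Toy 6x5 masks: a thin stem above a block of fruit, plus two stray pixels.
cluster = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 1],
]
fruit = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [1, 1, 1, 1, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0],
]

branch = subtract_masks(cluster, fruit)
main_branch = largest_component(branch)
print(picking_point(main_branch))
```

In this toy case the subtraction leaves a two-pixel stem and two isolated pixels; the connectivity step discards the isolated pixels, so the picking point falls on the stem. A real pipeline would apply these steps to the network's predicted masks, with morphological filtering before the connectivity analysis.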
About the journal:
Computers and Electronics in Agriculture provides international coverage of advancements in computer hardware, software, electronic instrumentation, and control systems applied to agricultural challenges. Encompassing agronomy, horticulture, forestry, aquaculture, and animal farming, the journal publishes original papers, reviews, and applications notes. It explores the use of computers and electronics in plant or animal agricultural production, covering topics like agricultural soils, water, pests, controlled environments, and waste. The scope extends to on-farm post-harvest operations and relevant technologies, including artificial intelligence, sensors, machine vision, robotics, networking, and simulation modeling. Its companion journal, Smart Agricultural Technology, continues the focus on smart applications in production agriculture.