FARnet: Farming Action Recognition From Videos Based on Coordinate Attention and YOLOv7-tiny Network in Aquaculture
Authors: Xinting Yang, Liang Pan, Dinghong Wang, Yuhao Zeng, Wentao Zhu, Dongxiang Jiao, Zhenlong Sun, Chuanheng Sun, Chao Zhou
Journal: Journal of the ASABE (impact factor 1.2; JCR Q3, Agricultural Engineering)
Publication date: 2023-01-01
Publication type: Journal Article
DOI: 10.13031/ja.15362 (https://doi.org/10.13031/ja.15362)
Citations: 0
Abstract
Highlights: The automatic detection and recognition of farming actions in video are realized. YOLOv7-tiny was enhanced by incorporating Coordinate Attention (CA). The performance indices mAP@.5 and mAP@.5:.95 improved by 0.1 and 6.6 percentage points, respectively. An intelligent method for detecting "inspection" and "applying pesticides" is provided.

In aquaculture, regular "inspection" and "applying pesticides" are essential to improving production efficiency and treating fish disease, but current aquaculture systems do not effectively support these activities. This paper therefore proposes a farming action recognition network (FARnet) that can accurately locate farmers in video and detect the actions of "applying pesticides" and "inspection." The dataset was captured with multi-angle cameras and produced in consultation with relevant experts. In this network, Coordinate Attention (CA) was used to improve the Efficient Layer Aggregation Networks-tiny (ELAN-tiny) and Spatial Pyramid Pooling (SPP) structures of the YOLOv7-tiny network. The implementation is as follows: (1) The convolution in ELAN-tiny was replaced with the CA module, and a shortcut was added. (2) A CA module was added to the final layer of the SPP module. (3) The improved Efficient Layer Aggregation Networks-Coordinate Attention (ELAN-CA) and Spatial Pyramid Pooling-Coordinate Attention (SPP-CA) structures were used in the backbone to extract action features and to perform feature correction by ADD (feature fusion by feature-map summation). The results demonstrated that FARnet achieved significantly better detection results than the YOLOv7-tiny network: mAP@.5 improved by 0.1 percentage points, from 99.4% to 99.5%, and mAP@.5:.95 improved by 6.6 percentage points, from 78.2% to 84.8%. FARnet can therefore effectively detect and identify the "inspection" and "applying pesticides" actions of farmers and provide useful input for an intelligent management system.

Keywords: Action detection, Applying pesticides, Coordinate attention, FARnet, Inspection.
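The two reported metrics differ in how strictly a predicted box must overlap the ground truth: mAP@.5 counts a detection as correct at an intersection-over-union (IoU) of at least 0.5, while mAP@.5:.95 averages over ten IoU thresholds from 0.50 to 0.95 in steps of 0.05. The following is a minimal Python sketch of that distinction and of the gains quoted above. The function names and the example boxes are hypothetical illustrations, not taken from the paper, and the sketch does not reproduce the authors' evaluation code.

```python
# Illustrative sketch only: helper names and example boxes are hypothetical.

def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp at zero so non-overlapping boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# mAP@.5 uses only the first threshold; mAP@.5:.95 averages over all ten.
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(pred_box, true_box):
    """IoU thresholds at which the prediction would count as a match."""
    overlap = iou(pred_box, true_box)
    return [t for t in IOU_THRESHOLDS if overlap >= t]


def point_gain(baseline, improved):
    """Absolute gain in percentage points, rounded to one decimal place."""
    return round(improved - baseline, 1)


# Hypothetical farmer box: the prediction is shifted by 10 pixels.
true_box = (100, 100, 200, 300)
pred_box = (110, 110, 210, 300)

print(round(iou(pred_box, true_box), 3))
print(thresholds_passed(pred_box, true_box))
print(point_gain(99.4, 99.5), point_gain(78.2, 84.8))
```

In this example the slightly shifted box passes the 0.5 threshold but fails the stricter ones, which is why improvements in localization accuracy tend to show up in mAP@.5:.95 even when mAP@.5 is already close to its ceiling.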