Multimodal knowledge distillation framework for fish feeding behaviour recognition in industrial aquaculture
Authors: Zheng Zhang, Bosheng Zou, Qingsong Hu, Weiqian Li
Journal: Biosystems Engineering, volume 255, Article 104170 (impact factor 4.4; JCR Q1, Agricultural Engineering)
Publication date: 2025-05-08
DOI: 10.1016/j.biosystemseng.2025.104170
URL: https://www.sciencedirect.com/science/article/pii/S1537511025001060
Fish feeding behaviour recognition based on machine vision is of great significance in industrial aquaculture. Turbid water and overlapping fish during the feeding stage make accurate, low-cost feeding behaviour recognition challenging in real industrial aquaculture. To address these issues, a novel Multimodal Knowledge Distillation Recognition (MMKDR) framework, based on multimodal fusion and enhanced knowledge distillation, is proposed to achieve low-complexity, low-cost deployment. Specifically, we use the Feature Extraction module of ConvNeXt-T (CNXFE) to extract image features from video streams. Then, an Improved Multimodal Fusion (IMF) module is designed to generate the fused feature; it dynamically adjusts the weights of the image and water quality features. Next, a Lightweight Feeding Intensity Classification (LFIC) module is designed to predict the fish feeding intensity from the fused feature, which helps to optimise feeding strategies and reduce aquaculture management costs. To deploy the student model on low-cost embedded devices, we further reduce the parameters of the CNXFE and IMF, obtaining a smaller student model with 2.49M parameters. An Enhanced Knowledge Distillation (EKD) scheme with semi-supervised domain adaptation is presented to transfer knowledge with better recognition accuracy. It can reduce device and data annotation costs, promoting low-cost and low-carbon aquaculture. We carried out experiments and evaluated MMKDR in a real industrial aquaculture environment. The results showed that the student model achieved an accuracy of 96.65 % on the testing set and 91.36 % on a low-cost embedded device in real industrial aquaculture scenarios.
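The abstract does not spell out the EKD loss, so the following is only a minimal sketch of the generic soft-target distillation loss that teacher-student schemes of this kind commonly build on. It is not the authors' EKD scheme and omits the semi-supervised domain adaptation. The function names, the temperature and alpha values, and the three-class logits are all hypothetical choices made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Generic soft-target distillation loss (an assumption, not the paper's EKD).

    alpha weights the hard-label cross-entropy; (1 - alpha) weights the
    KL divergence between the softened teacher and student distributions.
    """
    # Hard-label cross-entropy of the student at temperature 1.
    p_student = softmax(student_logits)
    ce = -math.log(p_student[label])

    # KL(teacher || student) on temperature-softened distributions.
    q_teacher = softmax(teacher_logits, temperature)
    q_student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s)
             for t, s in zip(q_teacher, q_student) if t > 0)

    # The T^2 factor keeps the soft-target term on a comparable scale.
    return alpha * ce + (1 - alpha) * temperature ** 2 * kl


# Hypothetical logits for three feeding-intensity classes.
student = [2.0, 0.5, -1.0]
teacher = [0.0, 3.0, 0.0]
loss = distillation_loss(student, teacher, label=0)
```

When the student reproduces the teacher's logits exactly, the KL term vanishes and only the weighted cross-entropy remains, so the soft-target term penalises only disagreement with the teacher.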
About the journal:
Biosystems Engineering publishes research in engineering and the physical sciences that represents advances in the understanding or modelling of the performance of biological systems for sustainable developments in land use and the environment, agriculture and amenity, bioproduction processes and the food chain. The subject matter of the journal reflects the wide range and interdisciplinary nature of research in engineering for biological systems.