Adaptive spatial aggregation and viewpoint alignment for three-dimensional online multiple fish tracking
Yiran Liu, Beibei Li, Dingshuo Liu, Qingling Duan
Computers and Electronics in Agriculture, Volume 236, Article 110408. Published 2025-04-19.
DOI: 10.1016/j.compag.2025.110408
https://www.sciencedirect.com/science/article/pii/S0168169925005149
Citations: 0
Abstract
Three-dimensional (3D) multi-object tracking captures the movement trajectories of multiple fish simultaneously, which is essential for understanding and analysing their movements and behavioural patterns in 3D space. It also provides essential data for applications such as water-quality monitoring, disease diagnosis, and ecological assessment. However, multi-object tracking of fish in 3D space requires data association across different perspectives, and variations in scale and appearance across perspectives can lead to inaccurate object positioning and low identification rates. To address these challenges, this study proposes an online 3D multi-object tracking method for fish based on adaptive spatial aggregation and viewpoint alignment. Dynamic deformable convolution networks (DCNv3) and upsampling are employed to adaptively fuse the fixed-scale features generated by the backbone network, addressing the object-positioning difficulties caused by scale differences. The trajectories of the fish in both the top and side views are then obtained using a cascade tracker. Finally, a viewpoint-alignment approach reconstructs the trajectories in 3D space from the two-dimensional (2D) trajectories, thereby avoiding the identity-recognition issues caused by drastic changes in appearance. On the 3D-ZeF20 zebrafish dataset, the proposed algorithm achieved a multi-object tracking accuracy (MOTA) of 95.03%, an identification F1-score (IDF1) of 97.40%, and a monotonic mean time between failures (MTBFm) of 172 frames. The results demonstrate that the method addresses the difficulties in cross-view matching caused by changes in appearance and scale differences. It enables the simultaneous acquisition of multi-fish trajectories from the front view, the top view, and in 3D space, thereby achieving precise online tracking of multiple fish.
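For readers unfamiliar with the reported metrics, MOTA and IDF1 are usually computed from error counts using the standard definitions in the multi-object tracking literature. The sketch below is illustrative only: it assumes those standard definitions, the function names are hypothetical, and the counts are invented for the example rather than taken from the paper.

```python
def mota(false_negatives: int, false_positives: int,
         id_switches: int, num_gt: int) -> float:
    """Standard CLEAR MOT accuracy: 1 - (FN + FP + IDSW) / GT.

    num_gt is the total number of ground-truth object instances
    summed over all frames.
    """
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    errors = false_negatives + false_positives + id_switches
    return 1.0 - errors / num_gt


def idf1(idtp: int, idfp: int, idfn: int) -> float:
    """Standard identification F1: 2*IDTP / (2*IDTP + IDFP + IDFN)."""
    denom = 2 * idtp + idfp + idfn
    if denom == 0:
        raise ValueError("at least one count must be non-zero")
    return 2 * idtp / denom


# Hypothetical counts, NOT results from the paper.
example_mota = mota(false_negatives=300, false_positives=150,
                    id_switches=50, num_gt=10_000)
example_idf1 = idf1(idtp=9_700, idfp=250, idfn=300)
print(f"MOTA = {example_mota:.2%}, IDF1 = {example_idf1:.2%}")
```

MOTA penalises missed detections, false detections, and identity switches together, whereas IDF1 measures how consistently each predicted trajectory keeps the same identity, which is why tracking papers commonly report both.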
About the journal:
Computers and Electronics in Agriculture provides international coverage of advancements in computer hardware, software, electronic instrumentation, and control systems applied to agricultural challenges. Encompassing agronomy, horticulture, forestry, aquaculture, and animal farming, the journal publishes original papers, reviews, and application notes. It explores the use of computers and electronics in plant or animal agricultural production, covering topics such as agricultural soils, water, pests, controlled environments, and waste. The scope extends to on-farm post-harvest operations and relevant technologies, including artificial intelligence, sensors, machine vision, robotics, networking, and simulation modeling. Its companion journal, Smart Agricultural Technology, continues the focus on smart applications in production agriculture.