Authors: Miaojie Feng; Hao Jia; Zengqiang Yan; Xin Yang
Journal: IEEE Transactions on Multimedia, vol. 26, pp. 9060-9069
Publication date: 2024-04-08
Publication type: Journal Article
DOI: 10.1109/TMM.2024.3385669
URL: https://ieeexplore.ieee.org/document/10494553/
Impact factor: 9.7 (JCR Q1, Computer Science, Information Systems)
APCAFlow: All-Pairs Cost Volume Aggregation for Optical Flow Estimation
Optical flow estimation is a fundamental task in computer vision. The all-pairs correlation volume has enabled state-of-the-art performance in many optical flow estimation methods. However, all-pairs correlations provide only local matching cues and lack global context, which can lead to mismatches in textureless and occluded regions. In this paper, we propose a novel all-pairs correlation volume aggregation (APCA) method that includes two key innovations. The first is a cost volume splitting and reassembling approach that partitions the full cost volume into smaller blocks and rearranges those blocks to allow the use of 2D and 3D convolutions for cost volume aggregation. The second is hierarchical aggregation, which performs 2D convolutions within blocks for local matching aggregation and 3D convolutions across blocks for global matching aggregation. We further design a novel optical flow estimation network, APCAFlow, based on APCA. APCAFlow achieves performance comparable to the most advanced approach, FlowFormer, but with significantly lower complexity. Specifically, APCAFlow reduces the model parameters, inference time, and memory consumption by 24.1%, 35.5%, and 21.6%, respectively, compared to FlowFormer. Furthermore, APCA can be easily integrated into several existing all-pairs cost volume-based methods for performance improvement.
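The splitting and reassembling idea described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation. The function names, the 1/sqrt(C) scaling of the correlation, the choice to tile the target-image axes, and the block size are all assumptions made for illustration, since the abstract does not specify them. The 2D and 3D convolutions are omitted; the sketch only shows one plausible way to arrange an all-pairs volume into blocks so that such convolutions could be applied within and across them.

```python
import numpy as np

def all_pairs_correlation(f1, f2):
    """Dot-product correlation between every pixel pair.

    f1, f2: feature maps of shape (H, W, C).
    Returns a 4D volume of shape (H, W, H, W).
    The 1/sqrt(C) scaling is an assumption, not taken from the abstract.
    """
    c = f1.shape[-1]
    return np.einsum("ijc,klc->ijkl", f1, f2) / np.sqrt(c)

def split_into_blocks(volume, block):
    """Tile the last two (target-pixel) axes into block x block patches.

    Hypothetical layout: (H, W, H//block, W//block, block, block).
    """
    h, w, h2, w2 = volume.shape
    assert h2 % block == 0 and w2 % block == 0, "block must divide H and W"
    v = volume.reshape(h, w, h2 // block, block, w2 // block, block)
    # Bring the two block-index axes together, then the two in-block axes.
    return v.transpose(0, 1, 2, 4, 3, 5)

def reassemble(blocks):
    """Inverse of split_into_blocks: restore the (H, W, H, W) volume."""
    h, w, nh, nw, b, _ = blocks.shape
    return blocks.transpose(0, 1, 2, 4, 3, 5).reshape(h, w, nh * b, nw * b)

# Toy example: 4 x 6 feature maps with 8 channels, 2 x 2 blocks.
rng = np.random.default_rng(0)
feat1 = rng.standard_normal((4, 6, 8))
feat2 = rng.standard_normal((4, 6, 8))

vol = all_pairs_correlation(feat1, feat2)   # shape (4, 6, 4, 6)
blocks = split_into_blocks(vol, block=2)    # shape (4, 6, 2, 3, 2, 2)
restored = reassemble(blocks)               # shape (4, 6, 4, 6)
```

In this layout, the last two axes of `blocks` hold the matching costs inside one block, which is where a 2D convolution could operate, and the two axes before them index the blocks themselves, which is where aggregation across blocks could take place.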
About the journal:
The IEEE Transactions on Multimedia delves into diverse aspects of multimedia technology and applications, covering circuits, networking, signal processing, systems, software, and systems integration. The scope aligns with the Fields of Interest of the sponsors, ensuring a comprehensive exploration of research in multimedia.