{"title":"Object Detection on Monocular Images with Two- Dimensional Canonical Correlation Analysis","authors":"Zifan Yu, Suya You","doi":"10.1109/ICPR48806.2021.9412067","DOIUrl":null,"url":null,"abstract":"Accurate and robust detection of objects from monocular images is a fundamental vision task. This paper describes a novel approach of holistic scene understanding that can simultaneously achieve multiple tasks of scene reconstruction and object detection from a single monocular image. Rather than pursuing an independent solution for each individual task as most existing work does, we seek a globally optimal solution that holistically resolves the multiple perception and reasoning tasks in an effective manner. The approach explores the complementary properties of multimodal RGB images and depth data to improve scene perception tasks. It uniquely combines the techniques of canonical correlation analysis and deep learning to learn the most correlated features to maximize the modal cross-correlation for improving performance and robustness of object detection in complex environments. Extensive experiments have been conducted to evaluate and demonstrate the performances of proposed approach.","PeriodicalId":6783,"journal":{"name":"2020 25th International Conference on Pattern Recognition (ICPR)","volume":"90 1","pages":"5905-5912"},"PeriodicalIF":0.0000,"publicationDate":"2021-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 25th International Conference on Pattern Recognition (ICPR)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICPR48806.2021.9412067","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
Accurate and robust detection of objects from monocular images is a fundamental vision task. This paper describes a novel approach of holistic scene understanding that can simultaneously achieve multiple tasks of scene reconstruction and object detection from a single monocular image. Rather than pursuing an independent solution for each individual task as most existing work does, we seek a globally optimal solution that holistically resolves the multiple perception and reasoning tasks in an effective manner. The approach explores the complementary properties of multimodal RGB images and depth data to improve scene perception tasks. It uniquely combines the techniques of canonical correlation analysis and deep learning to learn the most correlated features to maximize the modal cross-correlation for improving performance and robustness of object detection in complex environments. Extensive experiments have been conducted to evaluate and demonstrate the performances of proposed approach.