Refinement of matching costs for stereo disparities using recurrent neural networks
Authors: Alper Emlek, Murat Peker
Journal: EURASIP Journal on Image and Video Processing
Published: 2021-04-06
DOI: https://doi.org/10.1186/s13640-021-00551-9
Depth information is essential for autonomous robotics applications. It can be acquired by finding matching pixels between the images of a stereo pair. Depth is inferred from a matching cost volume composed of the distances between candidate pixel pairs along the pre-aligned horizontal axis of the stereo images. Most approaches use matching costs to identify matches between stereo images and obtain depth information. Recently, researchers have been using convolutional neural network-based solutions to handle this matching problem. In this paper, we propose a novel method that refines matching costs using recurrent neural networks. Our motivation is to enhance the depth values obtained from matching costs. To this end, recurrent neural networks exploit the sequential information of matching costs along the horizontal axis to produce an enhanced disparity map. Exploiting this sequential information, we aim to determine the position of the correct matching point, as in speech processing problems. We use existing stereo algorithms to obtain the initial matching costs and then improve the results with recurrent neural networks. The results are evaluated on the KITTI 2012 and KITTI 2015 datasets and show that the matching cost three-pixel error decreases by an average of 14.5% on both datasets.
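To make the terms in the abstract concrete, the following is a minimal sketch of a matching cost volume, a winner-take-all disparity estimate, and a three-pixel error measure. It is not the authors' method: the abstract does not say which stereo algorithms produce the initial costs, so a plain absolute-difference cost is assumed here, the error measure is a simplified form (fraction of valid pixels off by more than three pixels), and the recurrent refinement step itself is not shown. The function names sad_cost_volume and three_pixel_error are hypothetical.

```python
import numpy as np


def sad_cost_volume(left, right, max_disp):
    """Absolute-difference matching cost per pixel and candidate disparity.

    left, right: rectified grayscale images of shape (H, W).
    Returns an array of shape (H, W, max_disp). Entries whose candidate
    match would fall outside the right image are set to +inf.
    (Assumed cost function; the paper's initial costs may differ.)
    """
    left = left.astype(np.float64)
    right = right.astype(np.float64)
    h, w = left.shape
    cost = np.full((h, w, max_disp), np.inf)
    for d in range(max_disp):
        # Left pixel (y, x) is compared with right pixel (y, x - d).
        cost[:, d:, d] = np.abs(left[:, d:] - right[:, :w - d])
    return cost


def three_pixel_error(disp_est, disp_gt, valid):
    """Fraction of valid pixels whose disparity is off by more than 3 px."""
    bad = np.abs(disp_est.astype(np.float64) - disp_gt) > 3.0
    return float(bad[valid].mean())


# Synthetic pair with a constant true disparity of 4 pixels.
rng = np.random.default_rng(0)
right = rng.random((8, 32))
true_disp = 4
left = np.roll(right, true_disp, axis=1)  # left[y, x] == right[y, x - 4]

cost = sad_cost_volume(left, right, max_disp=8)
# Winner-take-all: pick the lowest-cost disparity at each pixel.
disp = cost.argmin(axis=2)

# The first columns of `left` wrap around, so exclude them from scoring.
valid = np.zeros(left.shape, dtype=bool)
valid[:, true_disp:] = True
gt = np.full(left.shape, float(true_disp))
print(three_pixel_error(disp, gt, valid))
```

For each pixel, the slice cost[y, x, :] is a one-dimensional sequence over candidate disparities. A sequence of this kind is what a recurrent network could take as input when refining the costs before the winner-take-all step.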
About the journal:
EURASIP Journal on Image and Video Processing is intended for researchers from both academia and industry who are active in the multidisciplinary field of image and video processing. The scope of the journal covers all theoretical and practical aspects of the domain, from basic research to application development.