ℋC-search for structured prediction in computer vision

2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Pub Date: 2015-06-07. DOI: 10.1109/CVPR.2015.7299126

Michael Lam, J. Doppa, S. Todorovic, Thomas G. Dietterich


Citations: 29

Abstract

The mainstream approach to structured prediction problems in computer vision is to learn an energy function such that the solution minimizes that function. At prediction time, this approach must solve an often-challenging optimization problem. Search-based methods provide an alternative that has the potential to achieve higher performance. These methods learn to control a search procedure that constructs and evaluates candidate solutions. The recently-developed ℋC-Search method has been shown to achieve state-of-the-art results in natural language processing, but mixed success when applied to vision problems. This paper studies whether ℋC-Search can achieve similarly competitive performance on basic vision tasks such as object detection, scene labeling, and monocular depth estimation, where the leading paradigm is energy minimization. To this end, we introduce a search operator suited to the vision domain that improves a candidate solution by probabilistically sampling likely object configurations in the scene from the hierarchical Berkeley segmentation. We complement this search operator by applying the DAgger algorithm to robustly train the search heuristic so it learns from its previous mistakes. Our evaluation shows that these improvements reduce the branching factor and search depth, and thus give a significant performance boost. Our state-of-the-art results on scene labeling and depth estimation suggest that ℋC-Search provides a suitable tool for learning and inference in vision.
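The search-based idea described in the abstract can be illustrated with a minimal sketch. It assumes the general ℋC-Search scheme: a heuristic function guides the search that generates candidate outputs, and a separate cost function picks the best candidate among those visited. Everything concrete here is hypothetical and chosen only for illustration: the names `successors` and `hc_search`, the greedy search strategy, and the toy operator that changes one label at a time. The paper's operator instead samples object configurations from the hierarchical Berkeley segmentation, and its heuristic and cost functions are learned rather than hand-written.

```python
from typing import Callable, List, Tuple

# A candidate structured output: one integer label per element (e.g. per region).
Labeling = Tuple[int, ...]


def successors(y: Labeling, num_labels: int) -> List[Labeling]:
    """Toy search operator (an assumption): change one position to another label."""
    out: List[Labeling] = []
    for i, current in enumerate(y):
        for label in range(num_labels):
            if label != current:
                out.append(y[:i] + (label,) + y[i + 1:])
    return out


def hc_search(
    y0: Labeling,
    heuristic: Callable[[Labeling], float],
    cost: Callable[[Labeling], float],
    num_labels: int,
    max_steps: int,
) -> Labeling:
    """Greedy search guided by `heuristic`; returns the visited candidate
    with the lowest `cost`. Both functions would be learned in practice."""
    current = y0
    visited = [y0]
    for _ in range(max_steps):
        candidates = successors(current, num_labels)
        if not candidates:
            break
        # The heuristic decides where the search goes next.
        current = min(candidates, key=heuristic)
        visited.append(current)
    # The cost function selects the final prediction among visited candidates.
    return min(visited, key=cost)


if __name__ == "__main__":
    target = (1, 0, 2)

    def hamming(y: Labeling) -> float:
        # Stand-in for a learned function: distance to a known target labeling.
        return float(sum(a != b for a, b in zip(y, target)))

    print(hc_search((0, 0, 0), hamming, hamming, num_labels=3, max_steps=5))
```

In this sketch the search keeps every visited candidate, so the final answer does not have to be the last state reached. That separation between generating candidates and scoring them is the reason the scheme uses two functions.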

