基于样本核自适应的视觉识别与分组

2015 IEEE International Conference on Computer Vision (ICCV) Pub Date : 2015-12-07 DOI:10.1109/ICCV.2015.148

Borislav Antic, B. Ommer

{"title":"基于样本核自适应的视觉识别与分组","authors":"Borislav Antic, B. Ommer","doi":"10.1109/ICCV.2015.148","DOIUrl":null,"url":null,"abstract":"Object, action, or scene representations that are corrupted by noise significantly impair the performance of visual recognition. Typically, partial occlusion, clutter, or excessive articulation affects only a subset of all feature dimensions and, most importantly, different dimensions are corrupted in different samples. Nevertheless, the common approach to this problem in feature selection and kernel methods is to down-weight or eliminate entire training samples or the same dimensions of all samples. Thus, valuable signal is lost, resulting in suboptimal classification. Our goal is, therefore, to adjust the contribution of individual feature dimensions when comparing any two samples and computing their similarity. Consequently, per-sample selection of informative dimensions is directly integrated into kernel computation. The interrelated problems of learning the parameters of a kernel classifier and determining the informative components of each sample are then addressed in a joint objective function. The approach can be integrated into the learning stage of any kernel-based visual recognition problem and it does not affect the computational performance in the retrieval phase. Experiments on diverse challenges of action recognition in videos and indoor scene classification show the general applicability of the approach and its ability to improve learning of visual representations.","PeriodicalId":6633,"journal":{"name":"2015 IEEE International Conference on Computer Vision (ICCV)","volume":"6 1","pages":"1251-1259"},"PeriodicalIF":0.0000,"publicationDate":"2015-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":"{\"title\":\"Per-Sample Kernel Adaptation for Visual Recognition and Grouping\",\"authors\":\"Borislav Antic, B. Ommer\",\"doi\":\"10.1109/ICCV.2015.148\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Object, action, or scene representations that are corrupted by noise significantly impair the performance of visual recognition. Typically, partial occlusion, clutter, or excessive articulation affects only a subset of all feature dimensions and, most importantly, different dimensions are corrupted in different samples. Nevertheless, the common approach to this problem in feature selection and kernel methods is to down-weight or eliminate entire training samples or the same dimensions of all samples. Thus, valuable signal is lost, resulting in suboptimal classification. Our goal is, therefore, to adjust the contribution of individual feature dimensions when comparing any two samples and computing their similarity. Consequently, per-sample selection of informative dimensions is directly integrated into kernel computation. The interrelated problems of learning the parameters of a kernel classifier and determining the informative components of each sample are then addressed in a joint objective function. The approach can be integrated into the learning stage of any kernel-based visual recognition problem and it does not affect the computational performance in the retrieval phase. Experiments on diverse challenges of action recognition in videos and indoor scene classification show the general applicability of the approach and its ability to improve learning of visual representations.\",\"PeriodicalId\":6633,\"journal\":{\"name\":\"2015 IEEE International Conference on Computer Vision (ICCV)\",\"volume\":\"6 1\",\"pages\":\"1251-1259\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2015-12-07\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"4\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2015 IEEE International Conference on Computer Vision (ICCV)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICCV.2015.148\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2015 IEEE International Conference on Computer Vision (ICCV)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCV.2015.148","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 4

摘要

被噪声破坏的对象、动作或场景表示会严重损害视觉识别的性能。通常，部分遮挡、杂波或过度衔接只影响所有特征维度的一个子集，最重要的是，不同的维度在不同的样本中被破坏。然而，在特征选择和核方法中，解决这一问题的常用方法是降权或消除整个训练样本或所有样本的相同维度。因此，有价值的信号丢失，导致次优分类。因此，我们的目标是在比较任意两个样本并计算它们的相似性时调整单个特征维度的贡献。因此，每样本信息维度的选择直接集成到核计算中。学习核分类器的参数和确定每个样本的信息成分的相关问题，然后在联合目标函数中得到解决。该方法可以集成到任何基于核的视觉识别问题的学习阶段，并且不影响检索阶段的计算性能。对视频和室内场景分类中各种动作识别挑战的实验表明了该方法的普遍适用性及其提高视觉表征学习的能力。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Per-Sample Kernel Adaptation for Visual Recognition and Grouping

Object, action, or scene representations that are corrupted by noise significantly impair the performance of visual recognition. Typically, partial occlusion, clutter, or excessive articulation affects only a subset of all feature dimensions and, most importantly, different dimensions are corrupted in different samples. Nevertheless, the common approach to this problem in feature selection and kernel methods is to down-weight or eliminate entire training samples or the same dimensions of all samples. Thus, valuable signal is lost, resulting in suboptimal classification. Our goal is, therefore, to adjust the contribution of individual feature dimensions when comparing any two samples and computing their similarity. Consequently, per-sample selection of informative dimensions is directly integrated into kernel computation. The interrelated problems of learning the parameters of a kernel classifier and determining the informative components of each sample are then addressed in a joint objective function. The approach can be integrated into the learning stage of any kernel-based visual recognition problem and it does not affect the computational performance in the retrieval phase. Experiments on diverse challenges of action recognition in videos and indoor scene classification show the general applicability of the approach and its ability to improve learning of visual representations.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2015 IEEE International Conference on Computer Vision (ICCV)

自引率

0.00%

发文量