{"title":"基于本地化内容的图像检索与自学多实例学习","authors":"Qifeng Qiao, P. Beling","doi":"10.1109/ICDMW.2009.105","DOIUrl":null,"url":null,"abstract":"There are many scenarios in which multi-instance learning problems may be difficult to solve because of a lack of correctly labeled examples for algorithm training. Labeled examples may be difficult or expensive to obtain because human effort is often needed to produce labels and because there may be limitations on the ability to collect large samples for training from a homogeneous population. In this paper, we present a technique called self-taught multiple-instance learning (STMIL) that deals with learning from a limited number of ambiguously labeled examples. STMIL uses a sparse representation for examples belonging to different classes in terms of a shared dictionary derived from the unlabeled data. This sparse representation can be optimized under the multiple instance setting to both construct high-level features and unite the data distribution. We present an optimization procedure for STMIL along with experiments on localized content-based image retrieval. Our experimental results suggest that, though it learns from a small number of labeled examples, STMIL is superior to standard algorithms in terms of computational efficiency and is at least competitive in terms of accuracy.","PeriodicalId":351078,"journal":{"name":"2009 IEEE International Conference on Data Mining Workshops","volume":"226 6","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2009-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"7","resultStr":"{\"title\":\"Localized Content Based Image Retrieval with Self-Taught Multiple Instance Learning\",\"authors\":\"Qifeng Qiao, P. 
Beling\",\"doi\":\"10.1109/ICDMW.2009.105\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"There are many scenarios in which multi-instance learning problems may be difficult to solve because of a lack of correctly labeled examples for algorithm training. Labeled examples may be difficult or expensive to obtain because human effort is often needed to produce labels and because there may be limitations on the ability to collect large samples for training from a homogeneous population. In this paper, we present a technique called self-taught multiple-instance learning (STMIL) that deals with learning from a limited number of ambiguously labeled examples. STMIL uses a sparse representation for examples belonging to different classes in terms of a shared dictionary derived from the unlabeled data. This sparse representation can be optimized under the multiple instance setting to both construct high-level features and unite the data distribution. We present an optimization procedure for STMIL along with experiments on localized content-based image retrieval. 
Our experimental results suggest that, though it learns from a small number of labeled examples, STMIL is superior to standard algorithms in terms of computational efficiency and is at least competitive in terms of accuracy.\",\"PeriodicalId\":351078,\"journal\":{\"name\":\"2009 IEEE International Conference on Data Mining Workshops\",\"volume\":\"226 6\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2009-12-06\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"7\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2009 IEEE International Conference on Data Mining Workshops\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICDMW.2009.105\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2009 IEEE International Conference on Data Mining Workshops","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDMW.2009.105","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Localized Content Based Image Retrieval with Self-Taught Multiple Instance Learning
There are many scenarios in which multiple-instance learning problems are difficult to solve because of a lack of correctly labeled examples for algorithm training. Labeled examples may be difficult or expensive to obtain, both because human effort is often needed to produce labels and because it may not be possible to collect large training samples from a homogeneous population. In this paper, we present a technique called self-taught multiple-instance learning (STMIL) that learns from a limited number of ambiguously labeled examples. STMIL represents examples belonging to different classes sparsely in terms of a shared dictionary derived from unlabeled data. This sparse representation can be optimized under the multiple-instance setting both to construct high-level features and to unify the data distribution. We present an optimization procedure for STMIL along with experiments on localized content-based image retrieval. Our experimental results suggest that, although it learns from a small number of labeled examples, STMIL is superior to standard algorithms in computational efficiency and at least competitive in accuracy.
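The core mechanism the abstract describes — encoding each instance in a bag sparsely against a dictionary derived from unlabeled data, then forming a bag-level feature — can be sketched as follows. This is a minimal illustrative approximation, not the paper's actual joint optimization: the dictionary here is a hypothetical stand-in of random unit-norm atoms (in STMIL it would be learned from the unlabeled data), the sparse coder is plain greedy orthogonal matching pursuit, and max-pooling the codes into a bag feature is an assumed pooling choice.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shared dictionary: in STMIL this would be learned from
# unlabeled data; here we stand in with random unit-norm atoms.
d, m = 16, 32                      # instance feature dim, number of atoms
D = rng.normal(size=(d, m))
D /= np.linalg.norm(D, axis=0)     # unit-norm columns

def sparse_code(D, x, k):
    """Greedy orthogonal matching pursuit: approximate x with <= k atoms."""
    residual = x.astype(float).copy()
    support = []
    coef = np.zeros(0)
    for _ in range(k):
        # pick the atom most correlated with the current residual
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j not in support:
            support.append(j)
        # re-fit coefficients on the selected atoms, update residual
        sub = D[:, support]
        coef, *_ = np.linalg.lstsq(sub, x, rcond=None)
        residual = x - sub @ coef
    code = np.zeros(D.shape[1])
    code[support] = coef
    return code

# A "bag" of instances (e.g., regions of one image in localized CBIR):
# encode each instance, then max-pool absolute codes into one bag feature.
bag = rng.normal(size=(5, d))
codes = np.array([sparse_code(D, x, k=3) for x in bag])
bag_feature = np.abs(codes).max(axis=0)

print(bag_feature.shape)  # → (32,)
```

Because every bag, labeled or not, is encoded against the same dictionary, bag-level features live in a shared space — the sense in which the sparse representation "unifies the data distribution" across the labeled and unlabeled examples.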