MI2LS: multi-instance learning from multiple informationsources

Dan Zhang, Jingrui He, Richard D. Lawrence
{"title":"MI2LS: multi-instance learning from multiple informationsources","authors":"Dan Zhang, Jingrui He, Richard D. Lawrence","doi":"10.1145/2487575.2487651","DOIUrl":null,"url":null,"abstract":"In Multiple Instance Learning (MIL), each entity is normally expressed as a set of instances. Most of the current MIL methods only deal with the case when each instance is represented by one type of features. However, in many real world applications, entities are often described from several different information sources/views. For example, when applying MIL to image categorization, the characteristics of each image can be derived from both its RGB features and SIFT features. Previous research work has shown that, in traditional learning methods, leveraging the consistencies between different information sources could improve the classification performance drastically. Out of a similar motivation, to incorporate the consistencies between different information sources into MIL, we propose a novel research framework -- Multi-Instance Learning from Multiple Information Sources (MI2LS). Based on this framework, an algorithm -- Fast MI2LS (FMI2LS) is designed, which combines Concave-Convex Constraint Programming (CCCP) method and an adapte- d Stoachastic Gradient Descent (SGD) method. Some theoretical analysis on the optimality of the adapted SGD method and the generalized error bound of the formulation are given based on the proposed method. Experimental results on document classification and a novel application -- Insider Threat Detection (ITD), clearly demonstrate the superior performance of the proposed method over state-of-the-art MIL methods.","PeriodicalId":20472,"journal":{"name":"Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining","volume":"77 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2013-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"32","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2487575.2487651","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 32

Abstract

In Multiple Instance Learning (MIL), each entity is normally expressed as a set of instances. Most of the current MIL methods only deal with the case when each instance is represented by one type of features. However, in many real world applications, entities are often described from several different information sources/views. For example, when applying MIL to image categorization, the characteristics of each image can be derived from both its RGB features and SIFT features. Previous research work has shown that, in traditional learning methods, leveraging the consistencies between different information sources could improve the classification performance drastically. Out of a similar motivation, to incorporate the consistencies between different information sources into MIL, we propose a novel research framework -- Multi-Instance Learning from Multiple Information Sources (MI2LS). Based on this framework, an algorithm -- Fast MI2LS (FMI2LS) is designed, which combines Concave-Convex Constraint Programming (CCCP) method and an adapte- d Stoachastic Gradient Descent (SGD) method. Some theoretical analysis on the optimality of the adapted SGD method and the generalized error bound of the formulation are given based on the proposed method. Experimental results on document classification and a novel application -- Insider Threat Detection (ITD), clearly demonstrate the superior performance of the proposed method over state-of-the-art MIL methods.
MI2LS:从多个信息源进行多实例学习
在多实例学习(MIL)中,每个实体通常表示为一组实例。当前大多数MIL方法只处理每个实例由一种类型的特征表示的情况。然而,在许多现实世界的应用程序中,实体通常是从几个不同的信息源/视图描述的。例如,在将MIL应用于图像分类时,可以同时从图像的RGB特征和SIFT特征中获得图像的特征。以往的研究表明,在传统的学习方法中,利用不同信息源之间的一致性可以大大提高分类性能。出于类似的动机,为了将不同信息源之间的一致性纳入MIL,我们提出了一个新的研究框架——多信息源的多实例学习(MI2LS)。基于该框架,设计了一种将凹凸约束规划(CCCP)方法与自适应随机梯度下降(SGD)方法相结合的快速MI2LS (FMI2LS)算法。在此基础上,对该方法的最优性进行了理论分析,并给出了公式的广义误差界。在文档分类和一个新的应用——内部威胁检测(ITD)上的实验结果清楚地表明,该方法比最先进的MIL方法性能优越。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信