Enhancing Multimedia Imbalanced Concept Detection Using VIMP in Random Forests.

Saad Sadiq, Yilin Yan, Mei-Ling Shyu, Shu-Ching Chen, Hemant Ishwaran
Published in: Proceedings of the ... IEEE International Conference on Information Reuse and Integration, vol. 2016, pp. 601-608. Publication date: 2016-07-01 (Epub 2016-12-19). DOI: 10.1109/IRI.2016.87 (https://doi.org/10.1109/IRI.2016.87).
Citations: 10

Abstract

Recent developments in social media and cloud storage have led to exponential growth in the amount of multimedia data, which increases the complexity of managing, storing, indexing, and retrieving information from such big data. Many current content-based concept detection approaches still fall short of bridging the semantic gap. To address this problem, a multi-stage random forest framework is proposed that generates predictor variables from multivariate regressions using variable importance (VIMP). By fine-tuning the forests and significantly reducing the number of predictor variables, the concept detection scores are evaluated for the case where the concept of interest is rare and imbalanced, i.e., has little collaboration with other high-level concepts. In classical multivariate statistics, estimating the value of one coordinate from the other coordinates standardizes the covariates and depends on the variance of the correlations rather than on the mean. Thus, the conditional dependence on the data being normally distributed is eliminated. Experimental results demonstrate that the proposed framework outperforms the compared approaches in terms of Mean Average Precision (MAP) values.
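
The abstract reports results as Mean Average Precision (MAP) but does not spell out the evaluation protocol. As a rough illustration only, the sketch below computes the standard uninterpolated average precision for each concept and then averages across concepts. The function names, the toy concepts, and the scores are hypothetical and are not taken from the paper; the authors' exact protocol (for example, any ranking cutoff) may differ.

```python
from typing import Dict, List, Sequence, Tuple


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Uninterpolated average precision for one concept.

    labels: 1 if the item contains the concept, else 0.
    scores: detection scores; higher means more confident.
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    # Rank items from highest to lowest score (ties keep input order).
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank  # precision at this positive's rank
    # A concept with no positives contributes 0.0 by convention here.
    return precision_sum / hits if hits else 0.0


def mean_average_precision(
    runs: Dict[str, Tuple[List[int], List[float]]]
) -> float:
    """Mean of the per-concept average precisions."""
    if not runs:
        raise ValueError("need at least one concept")
    return sum(average_precision(y, s) for y, s in runs.values()) / len(runs)


# Hypothetical toy data: each concept is rare (2 positives among 8 items).
toy_runs = {
    "concept_a": ([0, 1, 0, 0, 0, 1, 0, 0],
                  [0.1, 0.9, 0.3, 0.2, 0.4, 0.8, 0.05, 0.6]),
    "concept_b": ([1, 0, 0, 0, 0, 0, 0, 1],
                  [0.7, 0.9, 0.1, 0.2, 0.3, 0.1, 0.2, 0.4]),
}
toy_map = mean_average_precision(toy_runs)
```

In this toy case both positives of concept_a are ranked first, so its average precision is 1, while concept_b has a negative ranked above its positives and scores lower. Because precision is only accumulated at the ranks of positive items, the metric stays sensitive to rare concepts, which is why it suits imbalanced detection tasks.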

