Selection of relevant information to improve Image Classification using Bag of Visual Words

Q4 Computer Science
Eduardo Fidalgo Fernández
DOI: https://doi.org/10.5565/REV/ELCVIA.1102
Journal: Electronic Letters on Computer Vision and Image Analysis, volume 12 1, pages 5-8
Publication date: 2018-03-05
Publication type: Journal Article
Citation count: 1

Abstract

Image classification is one of the main challenges in computer vision. The number of images increases exponentially every day, so it is important to classify them reliably. The conventional image classification pipeline usually consists of extracting local image features, encoding them as a feature vector, and classifying that vector with a previously trained model. With regard to feature encoding, the Bag of Words model and its extensions, such as pyramid matching and weighted schemes, have achieved good results and have become state-of-the-art methods. This process is not perfect: computers, like humans, may make mistakes at any step, causing a drop in classification performance. Some of the primary sources of error in large-scale image classification are the presence of multiple objects in the image, small or very thin objects, incorrect annotations, and fine-grained recognition tasks, among others. Based on those problems and the steps of a typical image classification pipeline, the motivation of this PhD thesis was to provide guidelines for improving the quality of the extracted features in order to obtain better classification results. The contributions of the thesis demonstrated how good feature selection can improve fine-grained classification, and that a large training data set may not even be needed to learn the key features of each class and to predict with good results.
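
The three pipeline steps named in the abstract (extract local features, encode them as a feature vector, classify with a previously built model) can be illustrated with a minimal Bag of Visual Words sketch. This is not the thesis's implementation. It makes several assumptions for illustration: `toy_image` is a hypothetical stand-in for a real local feature extractor and produces synthetic 2-D points, the codebook is built with a plain k-means loop, and the classifier is a simple nearest-mean rule. All function names are invented for this example.

```python
import random

def nearest(vec, centers):
    """Return the index of the center closest to vec (squared Euclidean)."""
    return min(
        range(len(centers)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(vec, centers[i])),
    )

def build_codebook(descriptors, k, iters=10, seed=0):
    """Plain Lloyd's k-means over all local descriptors; returns k visual words."""
    rng = random.Random(seed)
    centers = [list(d) for d in rng.sample(descriptors, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for d in descriptors:
            groups[nearest(d, centers)].append(d)
        for i, g in enumerate(groups):
            if g:  # keep the old center if a cluster ends up empty
                centers[i] = [sum(col) / len(g) for col in zip(*g)]
    return centers

def encode(descriptors, codebook):
    """Encode one image as an L1-normalised histogram of visual-word counts."""
    hist = [0.0] * len(codebook)
    for d in descriptors:
        hist[nearest(d, codebook)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist

def classify(hist, class_means):
    """Nearest-mean rule: return the label whose mean histogram is closest."""
    labels = list(class_means)
    return labels[nearest(hist, [class_means[l] for l in labels])]

def toy_image(rng, n_blob0, n_blob1):
    """Hypothetical feature extractor: synthetic 2-D 'descriptors' drawn
    around (0, 0) and (10, 10) instead of real local image features."""
    pts = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(n_blob0)]
    pts += [(rng.gauss(10, 0.5), rng.gauss(10, 0.5)) for _ in range(n_blob1)]
    return pts

rng = random.Random(42)
# Class "a" images are dominated by one kind of descriptor, class "b" by the other.
train = {
    "a": [toy_image(rng, 24, 6) for _ in range(5)],
    "b": [toy_image(rng, 6, 24) for _ in range(5)],
}

# Step 1 and 2: pool the training descriptors and learn the visual vocabulary.
all_descriptors = [d for imgs in train.values() for img in imgs for d in img]
codebook = build_codebook(all_descriptors, k=2)

# Step 3: build the model (one mean histogram per class) and classify new images.
class_means = {}
for label, imgs in train.items():
    hists = [encode(img, codebook) for img in imgs]
    class_means[label] = [sum(col) / len(hists) for col in zip(*hists)]

pred_a = classify(encode(toy_image(rng, 24, 6), codebook), class_means)
pred_b = classify(encode(toy_image(rng, 6, 24), codebook), class_means)
print(pred_a, pred_b)
```

In a real setting the synthetic points would be replaced by descriptors from an actual feature extractor, the codebook would contain many more words, and the nearest-mean rule would typically give way to a stronger classifier. The sketch only shows how the encoding step turns a variable number of local descriptors into a fixed-length vector.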
Source journal
Electronic Letters on Computer Vision and Image Analysis (Computer Science: Computer Vision and Pattern Recognition)
CiteScore: 2.50
Self-citation rate: 0.00%
Articles published: 19
Review time: 12 weeks