Comparison of similarity measures in HSV quantization for CBIR

2017 International Conference on Data and Software Engineering (ICoDSE). Pub Date: 2017-11-01. DOI: 10.1109/ICODSE.2017.8285854

Jasman Pardede, B. Sitohang, Saiful Akbar, M. L. Khodra


Citations: 6

Abstract

This study implements several similarity measures for content-based image retrieval (CBIR) using HSV quantization: Euclidean Distance, Cramer-von Mises Divergence, Manhattan Distance, Cosine Similarity, Chi-Square Dissimilarity, Jeffrey Divergence, Pearson Correlation Coefficient, and Mahalanobis Distance. The purpose of the study is to measure the image retrieval performance of a CBIR system using HSV quantization under each of these similarity measures. Performance is evaluated by the precision, recall, and F-measure obtained from tests on the Wang dataset. Each measure was tested on ten categories (Africa, Beaches, Building, Bus, Dinosaur, Elephant, Flower, Horses, Mountain, and Food), each containing 100 images. The highest precision, 100%, was obtained with Jeffrey Divergence on the Dinosaur category. Jeffrey Divergence also gave the best average precision over all categories, 87.298%. For most measures (Euclidean Distance, Manhattan Distance, Cosine Similarity, Chi-Square Dissimilarity, Jeffrey Divergence, and Pearson Correlation Coefficient), the category with the best average precision is Dinosaur; for Cramer-von Mises Divergence it is Flower, and for Mahalanobis Distance it is Bus. The highest average recall, 92%, was obtained with Cosine Similarity on the Horses category. The best average recall over all categories, 38.700%, was obtained with Manhattan Distance. For most measures (Cramer-von Mises Divergence, Manhattan Distance, Cosine Similarity, Chi-Square Dissimilarity, Jeffrey Divergence, Pearson Correlation Coefficient, and Mahalanobis Distance), the category with the best average recall is Horses; for Euclidean Distance it is Africa. The highest F-measure, 87.255%, was obtained with Cosine Similarity on the Horses category. The experimental results showed that the highest F-measure always occurs on the Horses category. Manhattan Distance gave the highest F-measure in nine categories (Africa, Beaches, Building, Bus, Dinosaur, Elephant, Flower, Mountain, and Food), while Cosine Similarity gave the highest F-measure on the Horses category.
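The pipeline summarized in the abstract (quantize HSV colors into a histogram, compare histograms with a similarity measure, then score retrieval with precision, recall, and F-measure) can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: the 8 x 3 x 3 bin layout, all function names, and the particular Chi-Square and Jeffrey Divergence formulas are assumptions, since the abstract gives no definitions. Only five of the eight measures are shown.

```python
import colorsys
import math

# Assumed bin counts (8 hue x 3 saturation x 3 value = 72 bins).
# The abstract does not state the quantization scheme actually used.
H_BINS, S_BINS, V_BINS = 8, 3, 3


def hsv_histogram(pixels):
    """Normalized quantized-HSV histogram from (r, g, b) tuples in 0..255."""
    hist = [0.0] * (H_BINS * S_BINS * V_BINS)
    count = 0
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        hi = min(int(h * H_BINS), H_BINS - 1)  # clamp the value 1.0 into the top bin
        si = min(int(s * S_BINS), S_BINS - 1)
        vi = min(int(v * V_BINS), V_BINS - 1)
        hist[(hi * S_BINS + si) * V_BINS + vi] += 1.0
        count += 1
    return [c / count for c in hist] if count else hist


def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def manhattan(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))


def cosine_similarity(p, q):
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def chi_square(p, q):
    # One common form (assumed here): sum of (p - q)^2 / (p + q) over non-empty bins.
    return sum((a - b) ** 2 / (a + b) for a, b in zip(p, q) if a + b > 0)


def jeffrey_divergence(p, q):
    # Symmetric form measured against the bin-wise mean m = (p + q) / 2 (assumed).
    total = 0.0
    for a, b in zip(p, q):
        m = (a + b) / 2.0
        if a > 0:
            total += a * math.log(a / m)
        if b > 0:
            total += b * math.log(b / m)
    return total


def precision_recall_f(retrieved_labels, query_label, relevant_total):
    """Precision, recall, and F-measure for one query's retrieved list."""
    hits = sum(1 for label in retrieved_labels if label == query_label)
    precision = hits / len(retrieved_labels) if retrieved_labels else 0.0
    recall = hits / relevant_total if relevant_total else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure
```

In a setup like the one described, `relevant_total` would be 100 (the number of images per Wang category), and database images would be ranked by ascending distance, or by descending score for Cosine Similarity, before the top results are scored.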


