Integrating machine learning algorithms for robust content-based image retrieval

International Journal of Information Technology | Pub Date: 2024-09-07 | DOI: 10.1007/s41870-024-02169-2

Maher Alrahhal, K. P. Supreethi


Citations: 0

Abstract



This study introduces a framework for improving Content-Based Image Retrieval (CBIR) systems by integrating supervised and unsupervised machine learning algorithms. Supervised learning algorithms, such as K-Nearest Neighbors (KNN), Support Vector Machines (SVM), Linear Discriminant Analysis (LDA), and ensemble methods like Bagging and AdaBoost, are used together with unsupervised learning techniques, including K-Means and K-Medoids clustering, to improve CBIR performance. The framework builds on two feature extraction methods, ResNet-HOG Visual Word Fusion (RVWF) and ResNet-HOG Feature Fusion (RHFF), which use ResNet-50 to capture high-level semantic information and Histogram of Oriented Gradients (HOG) for detailed texture analysis. Three approaches were compared: similarity-based CBIR (standalone CBIR), classification-based CBIR, and clustering-based CBIR. Classification-based CBIR was superior to both standalone and clustering-based CBIR in retrieval accuracy and semantic interpretation, and the proposed methods outperformed state-of-the-art methods across the databases used in this study, including VisTex, Brodatz, Corel 10K, and Corel 1K. On VisTex, K-Medoids-based RVWF clustering raised performance from 98.75% to 99.52%, while classification methods such as Linear Discriminant or Bagging-based RVWF achieved 100% accuracy. Similarly, on Brodatz, K-Medoids-based RVWF clustering improved accuracy from 97.62% to 99.62%, with classification methods such as AdaBoost or Bagging-based RVWF reaching up to 100% accuracy. On Corel 1K and Corel 10K, K-Medoids-based RVWF clustering raised results to 95.61% and 99.20%, respectively, while classification methods further increased accuracy to 98.20% for Corel 1K and 100% for Corel 10K. These results show that combining advanced feature extraction with machine learning algorithms can improve the performance of CBIR systems: clustering-based CBIR outperformed standalone CBIR, and classification-based CBIR gave the best results, making it the most suitable for accurate image retrieval.
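
The three retrieval modes compared in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration rather than the authors' implementation: the tiny hand-written vectors stand in for ResNet-50 and HOG descriptors, `fuse` uses plain normalize-and-concatenate fusion, a 1-nearest-neighbor rule stands in for the classifiers named above, and `k_medoids` is a simple alternating variant with a naive deterministic initialization. All function names are hypothetical.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length so neither descriptor dominates."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def fuse(deep_feat, hog_feat):
    """Assumed fusion scheme: normalize each descriptor, then concatenate."""
    return l2_normalize(deep_feat) + l2_normalize(hog_feat)

def dist(a, b):
    """Euclidean distance between two fused feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rank(query, db, candidates, top_k):
    """Ids of the top_k candidate images closest to the query."""
    return sorted(candidates, key=lambda i: dist(query, db[i]))[:top_k]

def standalone_cbir(query, db, top_k):
    """Similarity-based retrieval: rank the whole database."""
    return rank(query, db, range(len(db)), top_k)

def classification_cbir(query, db, labels, top_k):
    """Predict the query's class (1-NN stand-in), then rank only that class."""
    predicted = labels[rank(query, db, range(len(db)), 1)[0]]
    same_class = [i for i, label in enumerate(labels) if label == predicted]
    return rank(query, db, same_class, top_k)

def assign(db, medoids):
    """Put every image into the cluster of its nearest medoid."""
    clusters = [[] for _ in medoids]
    for i, vec in enumerate(db):
        nearest = min(range(len(medoids)), key=lambda j: dist(vec, db[medoids[j]]))
        clusters[nearest].append(i)
    return clusters

def k_medoids(db, k, iters=10):
    """Alternate assignment and medoid update; evenly spaced initial medoids."""
    medoids = [j * len(db) // k for j in range(k)]
    for _ in range(iters):
        clusters = assign(db, medoids)
        updated = [
            min(members, key=lambda m: sum(dist(db[m], db[o]) for o in members))
            if members else medoids[j]
            for j, members in enumerate(clusters)
        ]
        if updated == medoids:
            break
        medoids = updated
    return medoids, assign(db, medoids)

def clustering_cbir(query, db, medoids, clusters, top_k):
    """Find the nearest medoid, then rank only that cluster's members."""
    nearest = min(range(len(medoids)), key=lambda j: dist(query, db[medoids[j]]))
    return rank(query, db, clusters[nearest], top_k)

# Toy database: (deep descriptor, HOG descriptor) pairs, all values made up.
raw = [
    ([1.0, 0.0], [0.9, 0.1]), ([0.9, 0.1], [1.0, 0.0]), ([1.0, 0.1], [0.8, 0.2]),
    ([0.0, 1.0], [0.1, 0.9]), ([0.1, 0.9], [0.0, 1.0]), ([0.2, 1.0], [0.1, 1.0]),
]
labels = ["brick", "brick", "brick", "grass", "grass", "grass"]
db = [fuse(deep, hog) for deep, hog in raw]
query = fuse([0.95, 0.05], [0.9, 0.05])  # a brick-like query image

medoids, clusters = k_medoids(db, k=2)
print("standalone:    ", standalone_cbir(query, db, 4))
print("classification:", classification_cbir(query, db, labels, 4))
print("clustering:    ", clustering_cbir(query, db, medoids, clusters, 4))
```

With four results requested from a database holding only three brick images, the standalone ranking has to include one grass image, while the classification-based and clustering-based variants narrow the candidate set first and return only brick images. That pre-filtering is one plausible reading of why the abstract reports higher accuracy for those two modes; the toy data is far too small to say anything about the reported percentages.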

