Integrating machine learning algorithms for robust content-based image retrieval
Maher Alrahhal, K. P. Supreethi
International Journal of Information Technology, published 2024-09-07
DOI: https://doi.org/10.1007/s41870-024-02169-2
Citations: 0
Abstract
This study introduces a robust framework for enhancing Content-Based Image Retrieval (CBIR) systems by integrating supervised and unsupervised machine learning algorithms. Supervised learning algorithms, such as K-Nearest Neighbors (KNN), Support Vector Machines (SVM), Linear Discriminant Analysis (LDA), and ensemble methods like Bagging and AdaBoost, are combined with unsupervised learning techniques, including K-Means and K-Medoids clustering, to improve CBIR performance. The core of the framework relies on two feature extraction methods, ResNet-HOG Visual Word Fusion (RVWF) and ResNet-HOG Feature Fusion (RHFF), which use ResNet-50 to capture high-level semantic information and Histogram of Oriented Gradients (HOG) for detailed texture analysis. Three approaches were compared: similarity-based (standalone) CBIR, classification-based CBIR, and clustering-based CBIR. The findings show that classification-based CBIR is superior to both standalone and clustering-based CBIR in retrieval accuracy and semantic interpretation. The proposed methods outperformed state-of-the-art methods on the databases used in this study, including VisTex, Brodatz, Corel 10K, and Corel 1K. On the VisTex database, clustering using K-Medoids-based RVWF increased performance from 98.75% to 99.52%, while classification methods such as Linear Discriminant or Bagging-based RVWF achieved 100% accuracy. Similarly, on the Brodatz database, K-Medoids-based RVWF clustering improved accuracy from 97.62% to 99.62%, with classification methods such as AdaBoost or Bagging-based RVWF reaching up to 100% accuracy.
For the Corel 1K and Corel 10K databases, K-Medoids-based RVWF clustering raised results to 95.61% and 99.20%, respectively, while classification methods further increased accuracy to 98.20% for Corel 1K and 100% for Corel 10K. These results show that combining advanced feature extraction with machine learning algorithms can improve the performance of CBIR systems. Clustering-based CBIR outperformed standalone CBIR, while classification-based CBIR gave the best results, making it the most suitable approach for accurate image retrieval.
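
The three retrieval modes compared in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes small synthetic vectors standing in for ResNet-50 and HOG descriptors, fuses them by plain concatenation after L2 normalisation (one possible fusion choice, not necessarily RHFF or RVWF), and uses a basic KNN vote and a basic alternating K-Medoids loop. All function names (`fuse`, `rank`, `knn_predict`, `k_medoids`) are hypothetical.

```python
import math
import random
from collections import Counter

def unit(v):
    """Scale a vector to unit L2 norm (left unchanged if it is all zeros)."""
    n = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / n for x in v]

def fuse(deep, hog):
    """Concatenate two normalised descriptors: one simple fusion choice."""
    return unit(deep) + unit(hog)

def rank(query, feats, candidates, top_k):
    """Return the top_k candidate indices closest to the query (Euclidean)."""
    return sorted(candidates, key=lambda i: math.dist(query, feats[i]))[:top_k]

def knn_predict(query, feats, labels, k=3):
    """Majority vote among the k nearest database images."""
    nearest = rank(query, feats, range(len(feats)), k)
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]

def k_medoids(feats, k, iters=20, seed=0):
    """Basic alternating k-medoids; returns (medoids, {medoid: members})."""
    def assign(medoids):
        clusters = {m: [] for m in medoids}
        for i, f in enumerate(feats):
            clusters[min(medoids, key=lambda m: math.dist(f, feats[m]))].append(i)
        return clusters

    medoids = random.Random(seed).sample(range(len(feats)), k)
    for _ in range(iters):
        clusters = assign(medoids)
        # New medoid of each cluster: the member with the smallest total distance.
        updated = [
            min(members, key=lambda c: sum(math.dist(feats[c], feats[j]) for j in members))
            for members in clusters.values()
        ]
        if updated == medoids:
            break
        medoids = updated
    return medoids, assign(medoids)

# Synthetic stand-ins for descriptors: 3 classes, 20 images each (assumed data).
rng = random.Random(42)

def sample(cls):
    deep = [rng.gauss(5.0 if d == cls else 0.0, 0.3) for d in range(4)]  # "ResNet" part
    hog = [rng.gauss(5.0 if d == cls else 0.0, 0.3) for d in range(3)]   # "HOG" part
    return fuse(deep, hog)

labels = [c for c in range(3) for _ in range(20)]
feats = [sample(c) for c in labels]
query = sample(1)

# 1) Standalone CBIR: rank the whole database by distance to the query.
standalone = rank(query, feats, range(len(feats)), 5)

# 2) Classification-based CBIR: predict the class, then rank only that class.
predicted = knn_predict(query, feats, labels)
class_based = rank(query, feats, [i for i, y in enumerate(labels) if y == predicted], 5)

# 3) Clustering-based CBIR: rank only the cluster whose medoid is nearest.
medoids, clusters = k_medoids(feats, k=3)
nearest_medoid = min(medoids, key=lambda m: math.dist(query, feats[m]))
cluster_based = rank(query, feats, clusters[nearest_medoid], 5)

print("standalone:", standalone)
print("classification-based:", class_based)
print("clustering-based:", cluster_based)
```

In this sketch, the classification and clustering steps both act as a filter that narrows the search space before similarity ranking. The classifier uses labels, so its filter aligns with semantic classes, whereas the clusters depend only on feature geometry and the initial medoids.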