Optimizing AI-generated image metadata with hybrid color analysis and semantic keyword structuring

IF 4.3; Zone 3, Computer Science; JCR Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Egyptian Informatics Journal, Pub Date: 2025-09-01, DOI: 10.1016/j.eij.2025.100775

Akara Thammastitkul

Volume 31, Article 100775. Source: https://www.sciencedirect.com/science/article/pii/S1110866525001689

Citations: 0

Abstract



Effective metadata optimization is crucial for improving the retrieval and classification of AI-generated images, with color playing a significant role in visual perception and searchability. This study proposes a hybrid metadata optimization framework integrating color-based feature extraction (K-Means clustering and Saliency Detection) with semantic keyword structuring to enhance metadata accuracy and keyword relevance. By combining global color distributions, subject-focused visual attributes, and AI-driven contextual analysis, the proposed method ensures structured and comprehensive image content representation. The methodology comprises three primary stages: (1) Hybrid Color Extraction, (2) AI-based Keyword Generation, and (3) Structured Keyword Optimization. The hybrid extraction process initially employs K-Means clustering to identify globally dominant colors, followed by Saliency Detection to highlight subject-specific hues. Extracted colors are then mapped to descriptive keywords, complemented by context-based keywords generated through an AI captioning model. The final keyword optimization phase systematically categorizes these terms into subject-based, color-based, and descriptive-emotional keywords. The effectiveness of the proposed approach is quantitatively evaluated using several performance metrics, including precision, recall, F1-score, false positive rate, top-10 retrieval accuracy, cosine similarity, Jaccard similarity, and coverage score. Experimental results demonstrate that the proposed framework achieves a precision of 92.10%, significantly enhancing retrieval accuracy and keyword structuring compared to conventional approaches and outperforming state-of-the-art baseline methods, including the Google Cloud Vision API. This research provides a scalable and efficient metadata enrichment solution applicable to digital libraries, image search engines, and content management systems, ensuring accurate, structured, and contextually relevant metadata for effective image retrieval.
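The abstract names several keyword-level metrics without defining them. As a rough illustration only, the Python sketch below computes precision, recall, F1-score, Jaccard similarity, and cosine similarity for one image, treating the generated and reference keywords as sets. The function name `keyword_metrics`, the lowercase normalization, the binary-vector form of cosine similarity, and the sample keyword lists are all assumptions made for this example; the paper may define these metrics differently (for instance, cosine similarity over text embeddings).

```python
from math import sqrt


def keyword_metrics(predicted, reference):
    """Set-based agreement between generated and reference keywords.

    Hypothetical helper for illustration; not taken from the paper.
    """
    # Normalize so that "Sunset " and "sunset" count as the same keyword.
    pred = {k.strip().lower() for k in predicted}
    ref = {k.strip().lower() for k in reference}

    tp = len(pred & ref)      # keywords present in both sets
    union = len(pred | ref)   # keywords present in either set

    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(ref) if ref else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if tp else 0.0
    jaccard = tp / union if union else 0.0
    # Cosine similarity of the two sets viewed as binary indicator vectors.
    cosine = tp / sqrt(len(pred) * len(ref)) if pred and ref else 0.0

    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "jaccard": jaccard,
        "cosine": cosine,
    }


# Hypothetical keyword lists, invented for illustration only.
generated = ["red", "sunset", "beach", "calm"]
reference = ["sunset", "beach", "ocean", "calm", "orange"]
scores = keyword_metrics(generated, reference)
print({name: round(value, 3) for name, value in scores.items()})
```

With these invented lists, three of the four generated keywords appear in the reference set, so precision is 0.75, recall is 0.6, and Jaccard similarity is 0.5. A dataset-level figure such as the reported 92.10% precision would presumably aggregate per-image values, but the abstract does not say how.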


Source journal

Egyptian Informatics Journal (Decision Sciences: Management Science and Operations Research)

CiteScore: 11.10

Self-citation rate: 1.90%

Articles published: (value missing in source)

Review time: 110 days

About the journal: The Egyptian Informatics Journal is published by the Faculty of Computers and Artificial Intelligence, Cairo University. The Journal provides a forum for state-of-the-art research and development in the fields of computing, including computer sciences, information technologies, information systems, operations research and decision support. Submissions of innovative, previously unpublished work in subjects covered by the Journal are encouraged, whether from academic, research or commercial sources.