Utilization of generative AI for the characterization and identification of visual unknowns

Natural Language Processing Journal. Pub Date: 2024-03-25. DOI: 10.1016/j.nlp.2024.100064

Kara Combs, Trevor J. Bihl, Subhashini Ganapathy


Citation count: 0

Abstract

Current state-of-the-art artificial intelligence (AI) struggles to interpret out-of-library objects accurately. One proposed remedy is analogical reasoning (AR), which uses abductive reasoning to draw inferences about an unfamiliar scenario from knowledge of a similar, familiar one. Currently, applications of visual AR gravitate toward analogy-formatted image problems rather than real-world computer vision data sets. This paper proposes the Image Recognition Through Analogical Reasoning Algorithm (IRTARA) and its “generative AI” version, “GIRTARA”, which describe and predict out-of-library visual objects. IRTARA characterizes the out-of-library object through a list of words called the “term frequency list”. GIRTARA uses the term frequency list to predict what the out-of-library object is. To evaluate the quality of IRTARA’s results, both quantitative and qualitative assessments are used, including a baseline that compares the automated methods with human-generated results. The accuracy of GIRTARA’s predictions is calculated through a cosine similarity analysis. This study observed that, for the high-quality results, IRTARA produced consistent term frequency lists across the three evaluation methods, and that GIRTARA achieved up to a 65% match in cosine similarity with the out-of-library object’s true labels.
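The two building blocks named in the abstract, a term frequency list and a cosine similarity score, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how IRTARA gathers its words or which vector representation GIRTARA's cosine analysis uses. The sketch assumes plain whitespace tokenization and bag-of-words vectors, and the function names, descriptions, and labels are all hypothetical.

```python
import math
from collections import Counter


def term_frequency_list(descriptions, top_k=5):
    """Count word occurrences across all descriptions; return the top_k (term, count) pairs."""
    counts = Counter()
    for text in descriptions:
        counts.update(text.lower().split())
    return counts.most_common(top_k)


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as {key: weight} mappings."""
    dot = sum(weight * b.get(key, 0.0) for key, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for empty vectors
    return dot / (norm_a * norm_b)


# Hypothetical descriptions of familiar objects judged similar to the unknown one.
descriptions = [
    "striped four legged animal",
    "four legged animal with mane",
    "striped animal",
]
terms = term_frequency_list(descriptions, top_k=3)

# Hypothetical predicted and true labels, compared as bag-of-words vectors.
predicted = Counter("wild horse".split())
truth = Counter("zebra horse".split())
score = cosine_similarity(predicted, truth)
```

A bag-of-words comparison only rewards exact word overlap. An embedding-based representation would also give credit to near-synonyms, and the abstract leaves open which of these the paper uses.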

