Automating chemosensory creativity assessment with large language models

IF 4.9; Zone 1, Agricultural and Forestry Sciences; JCR Q1, Food Science & Technology

Food Quality and Preference. Pub Date: 2025-05-27. DOI: 10.1016/j.foodqual.2025.105599

Qian Janice Wang, Robert Pellegrino

Food Quality and Preference, Volume 132, Article 105599. Full text: https://www.sciencedirect.com/science/article/pii/S0950329325001740

Citations: 0

Abstract



Automating chemosensory creativity assessment with large language models

Chemosensory creativity, the ability to innovate using taste and smell, is a crucial yet understudied aspect of human ingenuity. This study explores the potential of large language models (LLMs), specifically GPT-4, to assess creativity in the context of flavour pairings. Leveraging a novel chemosensory creativity test inspired by the Alternative Uses Task, 200 UK-based participants generated flavour pairings across four food categories. Subsequently, these pairings were rated for creativity, deliciousness, and surprise by human participants and two GPT-4 configurations: a deterministic (low-randomness) model termed “Strict GPT” and a stochastic (high-randomness) model termed “Flexible GPT.”

Findings reveal a strong correlation between human and Flexible GPT ratings of creativity (r = 0.89), surpassing that of Strict GPT (r = 0.71). Both humans and GPT models relied heavily on novelty (operationalised as surprise) rather than functionality (operationalised by deliciousness) as a determinant of creativity. However, GPT ratings exhibited a stronger emphasis on novelty and higher variability. While GPT-4 demonstrated strong potential for approximating human assessments in the UK context, differences emerged, particularly for rare flavour pairings where human ratings and model predictions had less alignment. These findings demonstrate the feasibility of using LLMs to automate creativity assessments in food-related domains by approximating human evaluations.
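The agreement figures above are correlation coefficients (r) between human and model ratings. As a rough illustration of how such a figure is obtained, the sketch below computes Pearson's r in plain Python. The function name `pearson_r` and the rating lists are hypothetical and chosen only for illustration; they are not data or code from the study, and the abstract does not state which correlation variant or software the authors used.

```python
import math


def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each sequence
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a sequence is constant")
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical mean creativity ratings per flavour pairing (NOT data from the study)
human_ratings = [2.0, 3.5, 4.0, 5.5, 6.0]
model_ratings = [2.5, 3.0, 4.5, 5.0, 6.5]

r = pearson_r(human_ratings, model_ratings)
print(f"r = {r:.2f}")
```

A value near 1 means the model ranks and spaces the pairings much as the human raters do; it does not by itself show that the two sets of ratings share the same mean or spread.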


Source journal

Food Quality and Preference (Engineering & Technology: Food Science & Technology)

CiteScore: 10.40

Self-citation rate: 15.10%

Articles published: 263

Review time: 38 days

Journal description: Food Quality and Preference is a journal devoted to sensory, consumer and behavioural research in food and non-food products. It publishes original research, critical reviews, and short communications in sensory and consumer science, and sensometrics. In addition, the journal publishes special invited issues on important timely topics and from relevant conferences. These are aimed at bridging the gap between research and application, bringing together authors and readers in consumer and market research, sensory science, sensometrics and sensory evaluation, nutrition and food choice, as well as food research, product development and sensory quality assurance. Submissions to Food Quality and Preference are limited to papers that include some form of human measurement; papers that are limited to physical/chemical measures or the routine application of sensory, consumer or econometric analysis will not be considered unless they specifically make a novel scientific contribution in line with the journal's coverage as outlined below.