Automating chemosensory creativity assessment with large language models
Authors: Qian Janice Wang, Robert Pellegrino
Journal: Food Quality and Preference, volume 132, Article 105599 (impact factor 4.9; JCR Q1, Food Science & Technology)
Published: 2025-05-27
DOI: 10.1016/j.foodqual.2025.105599
URL: https://www.sciencedirect.com/science/article/pii/S0950329325001740
Chemosensory creativity, the ability to innovate using taste and smell, is a crucial yet understudied aspect of human ingenuity. This study explores the potential of large language models (LLMs), specifically GPT-4, to assess creativity in the context of flavour pairings. Leveraging a novel chemosensory creativity test inspired by the Alternative Uses Task, 200 UK-based participants generated flavour pairings across four food categories. Subsequently, these pairings were rated for creativity, deliciousness, and surprise by human participants and two GPT-4 configurations: a deterministic (low-randomness) model termed “Strict GPT” and a stochastic (high-randomness) model termed “Flexible GPT.”
Findings reveal a strong correlation between human and Flexible GPT ratings of creativity (r = 0.89), surpassing that of Strict GPT (r = 0.71). Both humans and GPT models relied heavily on novelty (operationalised as surprise) rather than functionality (operationalised as deliciousness) as a determinant of creativity. However, GPT ratings placed a stronger emphasis on novelty and showed higher variability. While GPT-4 demonstrated strong potential for approximating human assessments in the UK context, differences emerged, particularly for rare flavour pairings, where human ratings and model predictions were less closely aligned. These findings demonstrate the feasibility of using LLMs to automate creativity assessments in food-related domains by approximating human evaluations.
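The agreement figures above are correlation coefficients between human and model ratings. As a minimal sketch of how such a coefficient could be computed, the following assumes that each flavour pairing has one mean human rating and one model rating; the function name `pearson_r` and the rating values are hypothetical illustrations, not the study's code or data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length rating lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the means (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalised variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant ratings")
    return cov / sqrt(var_x * var_y)


# Hypothetical mean creativity ratings per flavour pairing (NOT data from the study).
human_ratings = [2.1, 3.4, 4.8, 1.5, 3.9]
model_ratings = [2.5, 3.0, 4.6, 1.9, 4.2]

r = pearson_r(human_ratings, model_ratings)
print(f"r = {r:.2f}")
```

A value near 1 indicates that the model ranks and spaces the pairings much as the human raters do; it says nothing about whether the two sets of ratings share the same mean or scale.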
About the journal:
Food Quality and Preference is a journal devoted to sensory, consumer and behavioural research in food and non-food products. It publishes original research, critical reviews, and short communications in sensory and consumer science, and sensometrics. In addition, the journal publishes special invited issues on important timely topics and from relevant conferences. These are aimed at bridging the gap between research and application, bringing together authors and readers in consumer and market research, sensory science, sensometrics and sensory evaluation, nutrition and food choice, as well as food research, product development and sensory quality assurance. Submissions to Food Quality and Preference are limited to papers that include some form of human measurement; papers that are limited to physical/chemical measures or the routine application of sensory, consumer or econometric analysis will not be considered unless they specifically make a novel scientific contribution in line with the journal's coverage as outlined below.