Timothy R. McIntosh, Tong Liu, Teo Susnjak, Paul Watters, Malka N. Halgamuge
ACM Transactions on Interactive Intelligent Systems. Published 2024-06-03. DOI: 10.1145/3670691 (https://doi.org/10.1145/3670691)
A Reasoning and Value Alignment Test to Assess Advanced GPT Reasoning
In response to diverse perspectives on Artificial General Intelligence (AGI), ranging from potential safety and ethical concerns to more extreme views about the threats it poses to humanity, this research presents a generic method to gauge the reasoning capabilities of Artificial Intelligence (AI) models as a foundational step in evaluating safety measures. Recognizing that AI reasoning measures cannot be wholly automated, due to factors such as cultural complexity, we conducted an extensive examination of five commercial Generative Pre-trained Transformers (GPTs), focusing on their comprehension and interpretation of culturally intricate contexts. Using our novel “Reasoning and Value Alignment Test”, we assessed the GPT models’ ability to reason in complex situations and grasp local cultural subtleties. Our findings indicated that, although the models exhibited high levels of human-like reasoning, significant limitations remained, especially in the interpretation of cultural contexts. This paper also explored potential applications and use cases of our Test, underlining its significance in AI training, ethics compliance, sensitivity auditing, and AI-driven cultural consultation. We concluded by emphasizing its broader implications in the AGI domain, highlighting the necessity for interdisciplinary approaches, wider accessibility to various GPT models, and a profound understanding of the interplay between GPT reasoning and cultural sensitivity.
About the journal:
The ACM Transactions on Interactive Intelligent Systems (TiiS) publishes papers on research concerning the design, realization, or evaluation of interactive systems that incorporate some form of machine intelligence. TiiS articles come from a wide range of research areas and communities. An article can take any of several complementary views of interactive intelligent systems, focusing on:
the intelligent technology,
the interaction of users with the system, or
both aspects at once.