Measuring Human and AI Values based on Generative Psychometrics with Large Language Models

arXiv - CS - Computation and Language. Pub Date: 2024-09-18. DOI: arxiv-2409.12106

Haoran Ye, Yuhang Xie, Yuanyi Ren, Hanjun Fang, Xin Zhang, Guojie Song


Abstract

Human values and their measurement are a long-standing subject of interdisciplinary inquiry. Recent advances in AI have sparked renewed interest in this area, with large language models (LLMs) emerging as both tools and subjects of value measurement. This work introduces Generative Psychometrics for Values (GPV), an LLM-based, data-driven value measurement paradigm, theoretically grounded in text-revealed selective perceptions. We begin by fine-tuning an LLM for accurate perception-level value measurement and verifying the capability of LLMs to parse texts into perceptions, forming the core of the GPV pipeline. Applying GPV to human-authored blogs, we demonstrate its stability, validity, and superiority over prior psychological tools. Then, extending GPV to LLM value measurement, we advance the current art with 1) a psychometric methodology that measures LLM values based on their scalable and free-form outputs, enabling context-specific measurement; 2) a comparative analysis of measurement paradigms, indicating response biases of prior methods; and 3) an attempt to bridge LLM values and their safety, revealing the predictive power of different value systems and the impacts of various values on LLM safety. Through interdisciplinary efforts, we aim to leverage AI for next-generation psychometrics and psychometrics for value-aligned AI.
