Semantic analysis of test items through large language model embeddings predicts a-priori factorial structure of personality tests

Nicola Milano, Maria Luongo, Michela Ponticorvo, Davide Marocco

Current Research in Behavioral Sciences, Volume 8, Article 100168 (2025). DOI: 10.1016/j.crbeha.2025.100168
In this article, we explore the use of Large Language Models (LLMs) for predicting factor loadings in personality tests through the semantic analysis of test items. By leveraging text embeddings generated from LLMs, we evaluate the semantic similarity of test items and their alignment with hypothesized factorial structures without depending on human response data. Our methodology uses embeddings of the items from four different personality tests to examine correlations between item semantics and their grouping into principal factors. Our results indicate that LLM-derived embeddings can effectively capture semantic similarities among test items, showing moderate to high correlation with the factorial structure produced by human respondents in all tests, and potentially serving as a valid measure of content validity for initial survey design and refinement. This approach offers valuable insights into the robustness of embedding techniques in psychological evaluations, showing a significant correlation with traditional test structures and providing a novel perspective on test item analysis.
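The core idea in the abstract can be sketched as follows: embed each test item as a vector, then check whether items assigned to the same a-priori factor are more semantically similar to one another than to items from other factors. This is a minimal illustration, not the authors' actual pipeline: in the study the embeddings come from LLM embedding models applied to real questionnaire items, whereas here the item texts and their three-dimensional vectors are invented stand-ins.

```python
# Minimal sketch: within-factor vs. between-factor semantic similarity.
# Assumption: in practice, each vector would come from an LLM embedding
# model applied to the item text; these toy vectors are hypothetical.
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical items, each tagged with its a-priori factor.
items = {
    "I am the life of the party":      ("Extraversion", [0.9, 0.1, 0.2]),
    "I start conversations easily":    ("Extraversion", [0.8, 0.2, 0.1]),
    "I worry about things":            ("Neuroticism",  [0.1, 0.9, 0.3]),
    "I get stressed out easily":       ("Neuroticism",  [0.2, 0.8, 0.2]),
}

labels = list(items)
within, between = [], []
for i, a in enumerate(labels):
    for b in labels[i + 1:]:
        sim = cosine(items[a][1], items[b][1])
        same_factor = items[a][0] == items[b][0]
        (within if same_factor else between).append(sim)

avg_within = sum(within) / len(within)
avg_between = sum(between) / len(between)

# If the embeddings track the hypothesized factorial structure,
# same-factor items are more similar on average than cross-factor items.
print(avg_within > avg_between)
```

A full analysis along the paper's lines would compute such similarities for all item pairs of a test and correlate them with the factor-loading structure obtained from human response data, rather than just comparing two averages.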