Towards scalable and reliable coding of semantic property norms: ChatGPT vs. an improved AC-PLT

IF 3.9 | Zone 2, Psychology | JCR Q1, PSYCHOLOGY, EXPERIMENTAL
Diego Ramos, Sebastián Moreno, Enrique Canessa, Sergio E Chaigneau
Journal: Behavior Research Methods, 57(11), 302
Publication date: 2025-10-06
Publication type: Journal Article
DOI: 10.3758/s13428-025-02838-5 (https://doi.org/10.3758/s13428-025-02838-5)
Citations: 0

Abstract

When using the Property Listing Task (PLT) to collect semantic content for a set of concepts (Concept Property Norms, CPNs), coding raw properties into standardized labels poses significant challenges. In this work, we address these challenges by enhancing the Assisted Coding for Property Listing Task (AC-PLT) framework, which facilitates the coding process. The current work conducts an ablation study to optimize AC-PLT by evaluating combinations of text cleaning, embedding models (e.g., Word2Vec, E5, LaBSE), and classification methods (e.g., kNN, SVM, XGBoost). Results show that normalization with the E5 embedding model and kNN classification achieves the highest accuracy, with top-1 test accuracies of 0.523 for CPN27 and 0.608 for CPN120 datasets, outperforming the original AC-PLT baseline. Comparisons with ChatGPT (fine-tuned and one-shot) reveal AC-PLT's superior stability and cost-effectiveness, despite ChatGPT's competitive performance in some cases. The improved AC-PLT framework offers a scalable, efficient solution to manual coding challenges, reducing variability and time constraints. Future work will explore its role as a recommender system for human coders, further enhancing its practical utility in cognitive psychology and psycholinguistics research.
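
The pipeline the abstract describes (text cleaning, then an embedding, then kNN classification scored by top-1 accuracy) can be illustrated with a minimal sketch. This is not the authors' implementation: the character-trigram `embed` function is a toy stand-in for the E5 model so that the example runs offline, and the `TRAIN` and `TEST` pairs, the code labels, and all function names are hypothetical.

```python
import math
import re
from collections import Counter


def normalize(text):
    """Lowercase, replace punctuation with spaces, and collapse whitespace.
    A simple stand-in for the text-cleaning step (assumption)."""
    text = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    return " ".join(text.split())


def embed(text, n=3):
    """Toy embedding: counts of character trigrams.
    NOT E5; a placeholder so the sketch needs no model download."""
    padded = f" {normalize(text)} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(v * b.get(key, 0) for key, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_rank(query, train, k=3):
    """Rank candidate codes by summed similarity over the k nearest coded examples."""
    q = embed(query)
    scored = sorted(((cosine(q, embed(raw)), code) for raw, code in train), reverse=True)
    votes = Counter()
    for sim, code in scored[:k]:
        votes[code] += sim
    return [code for code, _ in votes.most_common()]


def top_n_accuracy(test, train, n=1, k=3):
    """Share of test items whose gold code appears among the top-n ranked codes."""
    hits = sum(gold in knn_rank(raw, train, k)[:n] for raw, gold in test)
    return hits / len(test)


# Hypothetical raw properties paired with standardized codes.
TRAIN = [
    ("has four legs", "has_legs"),
    ("it has legs", "has_legs"),
    ("is furry", "has_fur"),
    ("covered in fur", "has_fur"),
    ("barks loudly", "makes_sound"),
    ("makes a barking sound", "makes_sound"),
]
TEST = [("Has 4 legs!", "has_legs"), ("very furry", "has_fur")]

if __name__ == "__main__":
    print(knn_rank("Has 4 legs!", TRAIN))
    print(top_n_accuracy(TEST, TRAIN, n=1))
```

Returning a ranked list rather than a single label also fits the recommender use mentioned as future work, since a human coder could be shown the top few candidate codes. Swapping `embed` for a real sentence-embedding model would leave the rest of the sketch unchanged.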

Source journal
CiteScore: 10.30
Self-citation rate: 9.30%
Articles published: 266
About the journal: Behavior Research Methods publishes articles concerned with the methods, techniques, and instrumentation of research in experimental psychology. The journal focuses particularly on the use of computer technology in psychological research. An annual special issue is devoted to this field.