A multimodal transformer-based tool for automatic generation of concreteness ratings across languages.

Communications Psychology. Pub Date: 2025-07-08. DOI: 10.1038/s44271-025-00280-z

Viktor Kewenig, Jeremy I Skipper, Gabriella Vigliocco



Abstract

We present an automated method for generating concreteness ratings that achieves beyond-human-level reliability across multiple languages and expression types. Our approach combines multimodal transformers with emotion-finetuned language models and achieves correlations with existing corpora of human ratings of 0.93 for single British words and 0.85 for multiword expressions. We demonstrate general applicability through successful cross-lingual generalization to an entirely unseen corpus of Estonian single- and multi-word expressions (N = 35,979), achieved via automated language detection and translation. By leveraging both visual and emotional information in context-aware language embeddings, our method captures the full spectrum from concrete to abstract concepts. Our automated system offers a context-sensitive, reliable alternative to traditional human ratings, eliminating the need for time-consuming and costly human rating collection. We provide an easy-to-access web-based interface for researchers to use our tool at concreteness.eu.
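The reported agreement figures (0.93 and 0.85) are correlations between model-generated and human concreteness ratings. As a minimal sketch of how such a figure could be computed, the snippet below calculates a Pearson correlation over a handful of words. The abstract does not state which correlation coefficient the authors used, and the word list, the rating values, and the function name `pearson_r` are all made up for illustration; none of them come from the paper or from its tool.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical ratings on a 1 (abstract) to 5 (concrete) scale.
# These numbers are invented for the example, not taken from any corpus.
human = {"apple": 4.9, "table": 4.8, "river": 4.6,
         "justice": 1.5, "idea": 1.6, "hope": 1.3}
model = {"apple": 4.7, "table": 4.9, "river": 4.3,
         "justice": 1.8, "idea": 2.0, "hope": 1.4}

# Align the two rating sets on a shared, ordered word list.
words = sorted(human)
r = pearson_r([human[w] for w in words], [model[w] for w in words])
print(f"r = {r:.2f}")
```

A rank-based coefficient such as Spearman's could be substituted if the rating scales are not assumed to be linear in each other; which one matches the paper's analysis is not specified in the abstract.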
