Viktor Kewenig, Jeremy I Skipper, Gabriella Vigliocco
{"title":"A multimodal transformer-based tool for automatic generation of concreteness ratings across languages.","authors":"Viktor Kewenig, Jeremy I Skipper, Gabriella Vigliocco","doi":"10.1038/s44271-025-00280-z","DOIUrl":null,"url":null,"abstract":"<p><p>We present an automated method for generating concreteness ratings that achieves beyond human-level reliability across multiple languages and expression types. Our approach combines multimodal transformers with emotion-finetuned language models and achieves correlations of 0.93 for single British words and 0.85 for multiword expressions with existing corpora of human raters. We demonstrate general applicability through successful cross-lingual generalization to an entirely unseen corpus of Estonian single- and multi-word expressions (N = 35,979), achieved via automated language detection and translation. By leveraging both visual and emotional information in context-aware language embeddings, our method effectively captures the full spectrum from concrete to abstract concepts. Our automated system offers a context sensitive, reliable alternative to traditional human ratings, eliminating the need for time-consuming and costly human rating collection. We provide an easy to access web-based interface for research to use our tool under concreteness.eu .</p>","PeriodicalId":501698,"journal":{"name":"Communications Psychology","volume":"3 1","pages":"100"},"PeriodicalIF":0.0000,"publicationDate":"2025-07-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Communications Psychology","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1038/s44271-025-00280-z","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
We present an automated method for generating concreteness ratings that achieves beyond human-level reliability across multiple languages and expression types. Our approach combines multimodal transformers with emotion-finetuned language models and achieves correlations of 0.93 for single British words and 0.85 for multiword expressions with existing corpora of human raters. We demonstrate general applicability through successful cross-lingual generalization to an entirely unseen corpus of Estonian single- and multi-word expressions (N = 35,979), achieved via automated language detection and translation. By leveraging both visual and emotional information in context-aware language embeddings, our method effectively captures the full spectrum from concrete to abstract concepts. Our automated system offers a context sensitive, reliable alternative to traditional human ratings, eliminating the need for time-consuming and costly human rating collection. We provide an easy to access web-based interface for research to use our tool under concreteness.eu .