Learning about color from language.
Qiawen Liu, Jeroen van Paridon, Gary Lupyan
Communications Psychology, vol. 3(1), 60. Published 2025-04-14.
DOI: https://doi.org/10.1038/s44271-025-00230-9
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11997174/pdf/

Certain colors are strongly associated with certain adjectives (e.g., red is hot, blue is cold). Some of these associations are grounded in visual experiences such as seeing glowing red embers. Surprisingly, despite having no visual experience, many congenitally blind people show very similar color associations, which are likely learned through language. We show that these associations are indeed embedded in the statistical structure of language. We apply a projection method to word embeddings trained on corpora of spoken and written language to identify color-adjective associations as they are represented in English. These projections were predictive of color-adjective associations reported by blind and sighted English speakers. The most predictive projections were generated by embeddings derived from a corpus of fiction, which outperformed even the state-of-the-art large language model, GPT-4. By augmenting the training corpora in various ways, we discover the types of sentences most responsible for conveying the color-adjective associations to the models. We find that word embedding models learn these associations from indirect (second-order) co-occurrences, and that, when prompted, people are able to identify some of the words that are most informative for associating colors with specific adjectives. Learning through linguistic co-occurrences is one way word meanings can be continually aligned across language users despite large variations in perceptual experience.
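
The abstract does not spell out the projection method, so the following is only a minimal sketch of one common way to project a word embedding onto an adjective axis. It assumes the axis is the difference between the vectors of two opposing adjectives (here hot and cold) and that the association score is the scalar projection of the color vector onto that axis. The function name `semantic_projection` and the three-dimensional toy vectors are hypothetical and invented for illustration; they are not taken from the paper or from any trained model.

```python
import numpy as np


def semantic_projection(word_vec, pole_a_vec, pole_b_vec):
    """Scalar projection of word_vec onto the axis pointing from pole_b to pole_a.

    Positive values mean the word lies toward pole_a, negative toward pole_b.
    (Hypothetical helper; the paper's exact procedure may differ.)
    """
    axis = pole_a_vec - pole_b_vec
    return float(np.dot(word_vec, axis) / np.linalg.norm(axis))


# Toy vectors, made up for this example; real embeddings would have
# hundreds of dimensions and come from a model trained on a corpus.
toy_embeddings = {
    "hot": np.array([1.0, 0.2, 0.0]),
    "cold": np.array([-1.0, 0.2, 0.0]),
    "red": np.array([0.6, 0.5, 0.1]),
    "blue": np.array([-0.4, 0.5, 0.3]),
}

red_score = semantic_projection(
    toy_embeddings["red"], toy_embeddings["hot"], toy_embeddings["cold"]
)
blue_score = semantic_projection(
    toy_embeddings["blue"], toy_embeddings["hot"], toy_embeddings["cold"]
)

print(f"red on cold-to-hot axis:  {red_score:+.2f}")
print(f"blue on cold-to-hot axis: {blue_score:+.2f}")
```

With these toy vectors, red lands on the hot side of the axis and blue on the cold side. With real embeddings, scores of this kind could then be compared against the associations that participants report.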