Does Linguistic Relativity Hypothesis Apply on ChatGPT Responses? Yes, It Does

Impact factor: 1.8; Zone 4 (Computer Science); JCR Q3 (Computer Science, Artificial Intelligence)
Partha Pratim Ray
Journal: Computational Intelligence, vol. 41, no. 4
Published: 2025-07-10
DOI: 10.1111/coin.70103
URL: https://onlinelibrary.wiley.com/doi/10.1111/coin.70103

Abstract

We present the first comprehensive, end-to-end quantitative evaluation of the linguistic relativity hypothesis in AI-generated text, using ChatGPT-4o mini to generate responses to 10 culturally salient prompts across 13 typologically diverse languages. Semantic shifts were quantified using pairwise cosine similarity scores computed from multilingual MiniLM sentence embeddings. A one-way analysis of variance (ANOVA) reveals statistically significant variation in semantic alignment across language pairs, with $F(77, 702) = 2.153$, $p = 2.29 \times 10^{-7}$, and effect size $\eta^2 = 0.191$. These results are further supported by a non-parametric Kruskal–Wallis test yielding $H = 176.208$, $p = 9.59 \times 10^{-10}$, indicating robust differences in distribution. Prompt-specific semantic shifts also exhibit significant variation, as shown by ANOVA results $F(9, 770) = 24.239$, $p = 1.00 \times 10^{-36}$, and $\eta^2 = 0.221$. Sentiment polarity analysis using the Polyglot toolkit reveals significant effects of language on sentiment distribution, with $F(12, 117) = 2.637$, $p = 0.0037$, and $\eta^2 = 0.213$. Disaggregated analysis shows that positivity ratios differ by prompt ($F(9, 120) = 3.621$, $p = 0.0005$, $\eta^2 = 0.214$), while negativity scores display even greater divergence across prompts, with $F(9, 120) = 12.755$, $p = 4.59 \times 10^{-14}$, and $\eta^2 = 0.489$. An unsupervised clustering procedure ($k = 3$) classifies languages into three distinct groups based on semantic alignment: (i) high-alignment (mean similarity $\ge 0.90$), (ii) intermediate (mean similarity $\approx 0.75$–$0.85$), and (iii) neutral-tone clusters. Each group exhibits distinctive polarity profiles, with median sentiment polarity ranging from $-0.02$ to $0.11$. These results demonstrate that linguistic structures exert a measurable influence on AI-generated content, underscoring the need for culturally sensitive AI design practices. They affirm that ChatGPT-4o mini's outputs align with the linguistic relativity hypothesis, illustrating that language structures significantly shape AI-driven interpretation. All associated code and data are available in the GitHub repository: https://github.com/ParthaPRay/Liguistic_Relativity_Chatgpt.
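The reported statistics can be cross-checked with a few lines of arithmetic. The sketch below is not the author's pipeline (that lives in the linked repository); it is a minimal illustration that assumes a balanced design in which every language pair contributes one similarity score per prompt and every language contributes one sentiment score per prompt. That assumption is inferred from the reported degrees of freedom, not stated in the abstract. The helper names (`cosine_similarity`, `one_way_dfs`, `eta_squared`) and the toy vectors are hypothetical. For a one-way ANOVA, $\eta^2 = SS_\text{between} / SS_\text{total}$ can be recovered from the F statistic as $F \cdot df_1 / (F \cdot df_1 + df_2)$.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def one_way_dfs(n_groups, n_per_group):
    """Between- and within-group degrees of freedom, balanced one-way ANOVA."""
    n_total = n_groups * n_per_group
    return n_groups - 1, n_total - n_groups


def eta_squared(f_stat, df_between, df_within):
    """Eta squared (SS_between / SS_total) recovered from a one-way F statistic."""
    return (f_stat * df_between) / (f_stat * df_between + df_within)


N_LANGUAGES, N_PROMPTS = 13, 10  # counts stated in the abstract
n_pairs = len(list(combinations(range(N_LANGUAGES), 2)))  # unordered language pairs

# Toy 3-dimensional vectors (hypothetical values, not real MiniLM embeddings).
toy_sim = cosine_similarity([1.0, 0.0, 1.0], [1.0, 1.0, 0.0])

# (label, reported F, number of groups, observations per group).
# The group sizes are ASSUMED from the balanced-design reading described above.
reported = [
    ("similarity by language pair", 2.153, n_pairs, N_PROMPTS),
    ("similarity by prompt", 24.239, N_PROMPTS, n_pairs),
    ("sentiment by language", 2.637, N_LANGUAGES, N_PROMPTS),
    ("positivity by prompt", 3.621, N_PROMPTS, N_LANGUAGES),
    ("negativity by prompt", 12.755, N_PROMPTS, N_LANGUAGES),
]

results = {}
for label, f_stat, groups, per_group in reported:
    df_b, df_w = one_way_dfs(groups, per_group)
    results[label] = (df_b, df_w, round(eta_squared(f_stat, df_b, df_w), 3))
    print(label, results[label])
```

Under this reading, 13 languages give 78 unordered pairs and 780 similarity scores, which reproduces the reported degrees of freedom (77, 702) and (9, 770), and the recovered effect sizes agree with the abstract's $\eta^2$ values to three decimals.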

Source journal
Computational Intelligence
Category: Engineering and Technology; Computer Science: Artificial Intelligence
CiteScore: 6.90
Self-citation rate: 3.60%
Articles published: 65
Review time: >12 weeks
About the journal: This leading international journal promotes and stimulates research in the field of artificial intelligence (AI). Covering a wide range of issues, from the tools and languages of AI to its philosophical implications, Computational Intelligence provides a vigorous forum for the publication of both experimental and theoretical research, as well as surveys and impact studies. The journal is designed to meet the needs of a wide range of AI workers in academic and industrial research.