Semantic change and word embeddings: case studies on the diachrony of Portuguese

Impact factor: 0.2 | JCR: 0 | Category: LANGUAGE & LINGUISTICS
Lucas Lage, Evandro Cunha
DOI: 10.17851/2237-2083.30.4.2043-2086 (https://doi.org/10.17851/2237-2083.30.4.2043-2086)
Publication date: 2022-10-06
Publication type: Journal Article
Citations: 0

Abstract

Mudança semântica e word embeddings: estudos de caso na diacronia do português / Semantic change and word embeddings: case studies on the diachrony of Portuguese
According to Givón (2001), the lexicon is a repository of concepts that are relatively stable over time, socially shared and well encoded. These concepts are organized in a network in which similar concepts are grouped next to each other. On a similar note, the lexicographer Georges Matoré proposes associative relationships between words and defines the concepts of notional field and testimonial words, which are organizational elements of the lexicon. Computational techniques such as word embeddings, which represent words as vectors in a vector space, make it possible to analyze groupings of words based on their semantic features. This paper explores the viability of such methods for studying semantic change. The occurrences of the word forms “deus”, “homem”, “mulher”, “pai”, “mae” and “terra” were analyzed in the Tycho Brahe corpus for Portuguese. Word embeddings were created using the Skip-gram algorithm, and visualizations of a semantic feature network were created for each word in three different time slices. The generated visualizations provide evidence of the semantic organization of the lexicon and of its reorganization.
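The workflow the abstract outlines (one vector space per time slice, then a comparison of each target word's neighborhood across slices) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy sentences are invented and are not taken from the Tycho Brahe corpus, the names `SLICES`, `cooccurrence_vectors`, `nearest_neighbors` and `neighbor_overlap` are hypothetical, and simple co-occurrence counts stand in for the Skip-gram embeddings used in the paper so that the example runs with the standard library alone.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical toy sentences per time slice (NOT taken from the Tycho Brahe corpus).
SLICES = {
    "slice_1": [
        "deus pai criou a terra".split(),
        "o pai louva a deus".split(),
        "a terra pertence a deus".split(),
    ],
    "slice_2": [
        "o homem lavra a terra".split(),
        "a terra alimenta o homem".split(),
        "deus pai guarda o homem".split(),
    ],
}


def cooccurrence_vectors(sentences, window=2):
    """Count-based stand-in for Skip-gram: one sparse context-count vector per word."""
    vectors = defaultdict(Counter)
    for sent in sentences:
        for i, word in enumerate(sent):
            lo, hi = max(0, i - window), min(len(sent), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][sent[j]] += 1
    return dict(vectors)


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def nearest_neighbors(vectors, target, k=3):
    """Return the k words whose vectors are most similar to the target's vector."""
    target_vec = vectors.get(target, Counter())
    scores = [(cosine(target_vec, vec), w) for w, vec in vectors.items() if w != target]
    scores.sort(key=lambda pair: (-pair[0], pair[1]))  # similarity first, then alphabetical
    return [w for _, w in scores[:k]]


def neighbor_overlap(a, b):
    """Jaccard overlap of two neighbor lists; lower values suggest more reorganization."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 1.0


# Build one vector space per slice and compare the neighborhood of "terra".
neighbors = {
    name: nearest_neighbors(cooccurrence_vectors(sents), "terra")
    for name, sents in SLICES.items()
}
for name, words in neighbors.items():
    print(name, words)
print("overlap:", neighbor_overlap(neighbors["slice_1"], neighbors["slice_2"]))
```

Comparing neighbor sets rather than raw vectors sidesteps the fact that independently built vector spaces are not directly comparable, which is presumably one reason neighborhood visualizations suit this kind of study. With real data, the counting step would be replaced by trained Skip-gram embeddings for each slice.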
Source journal: Revista de Estudos da Linguagem (LANGUAGE & LINGUISTICS)
CiteScore: 0.30
Self-citation rate: 0.00%
Articles published: 55
Review time: 52 weeks