词、结构和语料库:汉语空间粒子结构语义的网络表示

IF 1 2区 文学 0 LANGUAGE & LINGUISTICS
Alvin Cheng-Hsien Chen
{"title":"词、结构和语料库:汉语空间粒子结构语义的网络表示","authors":"Alvin Cheng-Hsien Chen","doi":"10.1515/cllt-2020-0012","DOIUrl":null,"url":null,"abstract":"Abstract In this study, we aim to demonstrate the effectiveness of network science in exploring the emergence of constructional semantics from the connectedness and relationships between linguistic units. With Mandarin locative constructions (MLCs) as a case study, we extracted constructional tokens from a representative corpus, including their respective space particles (SPs) and the head nouns of the landmarks (LMs), which constitute the nodes of the network. We computed edges based on the lexical similarities of word embeddings learned from large text corpora and the SP-LM contingency from collostructional analysis. We address three issues: (1) For each LM, how prototypical is it of the meaning of the SP? (2) For each SP, how semantically cohesive are its LM exemplars? (3) What are the emerging semantic fields from the constructional network of MLCs? We address these questions by examining the quantitative properties of the network at three levels: microscopic (i.e., node centrality and local clustering coefficient), mesoscopic (i.e., community) and macroscopic properties (i.e., small-worldness and scale-free). Our network analyses bring to the foreground the importance of repeated language experiences in the shaping and entrenchment of linguistic knowledge.","PeriodicalId":45605,"journal":{"name":"Corpus Linguistics and Linguistic Theory","volume":"18 1","pages":"209 - 235"},"PeriodicalIF":1.0000,"publicationDate":"2020-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1515/cllt-2020-0012","citationCount":"6","resultStr":"{\"title\":\"Words, constructions and corpora: Network representations of constructional semantics for Mandarin space particles\",\"authors\":\"Alvin Cheng-Hsien Chen\",\"doi\":\"10.1515/cllt-2020-0012\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Abstract In this study, we aim to demonstrate the effectiveness of network science in exploring the emergence of constructional semantics from the connectedness and relationships between linguistic units. With Mandarin locative constructions (MLCs) as a case study, we extracted constructional tokens from a representative corpus, including their respective space particles (SPs) and the head nouns of the landmarks (LMs), which constitute the nodes of the network. We computed edges based on the lexical similarities of word embeddings learned from large text corpora and the SP-LM contingency from collostructional analysis. We address three issues: (1) For each LM, how prototypical is it of the meaning of the SP? (2) For each SP, how semantically cohesive are its LM exemplars? (3) What are the emerging semantic fields from the constructional network of MLCs? We address these questions by examining the quantitative properties of the network at three levels: microscopic (i.e., node centrality and local clustering coefficient), mesoscopic (i.e., community) and macroscopic properties (i.e., small-worldness and scale-free). Our network analyses bring to the foreground the importance of repeated language experiences in the shaping and entrenchment of linguistic knowledge.\",\"PeriodicalId\":45605,\"journal\":{\"name\":\"Corpus Linguistics and Linguistic Theory\",\"volume\":\"18 1\",\"pages\":\"209 - 235\"},\"PeriodicalIF\":1.0000,\"publicationDate\":\"2020-08-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://sci-hub-pdf.com/10.1515/cllt-2020-0012\",\"citationCount\":\"6\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Corpus Linguistics and Linguistic Theory\",\"FirstCategoryId\":\"98\",\"ListUrlMain\":\"https://doi.org/10.1515/cllt-2020-0012\",\"RegionNum\":2,\"RegionCategory\":\"文学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"0\",\"JCRName\":\"LANGUAGE & LINGUISTICS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Corpus Linguistics and Linguistic Theory","FirstCategoryId":"98","ListUrlMain":"https://doi.org/10.1515/cllt-2020-0012","RegionNum":2,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"0","JCRName":"LANGUAGE & LINGUISTICS","Score":null,"Total":0}
引用次数: 6

摘要

摘要在本研究中,我们旨在证明网络科学从语言单元之间的连通性和关系来探索结构语义的出现的有效性。以普通话方位结构(MLCs)为例,我们从一个具有代表性的语料库中提取了结构表征,包括它们各自的空间粒子(SP)和地标的头名词(LMs),它们构成了网络的节点。我们基于从大型文本语料库中学习到的单词嵌入的词汇相似性和从搭配分析中获得的SP-LM偶然性来计算边缘。我们解决了三个问题:(1)对于每个LM,SP的意义有多典型?(2) 对于每个SP,其LM示例在语义上的内聚性如何?(3) MLC结构网络中出现的语义领域是什么?我们通过在三个层面上研究网络的定量性质来解决这些问题:微观性质(即节点中心性和局部聚类系数)、介观性质(即社区)和宏观性质(即小世界性和无标度)。我们的网络分析将重复的语言体验在语言知识的形成和巩固中的重要性带到了前台。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Words, constructions and corpora: Network representations of constructional semantics for Mandarin space particles
Abstract In this study, we aim to demonstrate the effectiveness of network science in exploring the emergence of constructional semantics from the connectedness and relationships between linguistic units. With Mandarin locative constructions (MLCs) as a case study, we extracted constructional tokens from a representative corpus, including their respective space particles (SPs) and the head nouns of the landmarks (LMs), which constitute the nodes of the network. We computed edges based on the lexical similarities of word embeddings learned from large text corpora and the SP-LM contingency from collostructional analysis. We address three issues: (1) For each LM, how prototypical is it of the meaning of the SP? (2) For each SP, how semantically cohesive are its LM exemplars? (3) What are the emerging semantic fields from the constructional network of MLCs? We address these questions by examining the quantitative properties of the network at three levels: microscopic (i.e., node centrality and local clustering coefficient), mesoscopic (i.e., community) and macroscopic properties (i.e., small-worldness and scale-free). Our network analyses bring to the foreground the importance of repeated language experiences in the shaping and entrenchment of linguistic knowledge.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
CiteScore
4.20
自引率
12.50%
发文量
15
期刊介绍: Corpus Linguistics and Linguistic Theory (CLLT) is a peer-reviewed journal publishing high-quality original corpus-based research focusing on theoretically relevant issues in all core areas of linguistic research, or other recognized topic areas. It provides a forum for researchers from different theoretical backgrounds and different areas of interest that share a commitment to the systematic and exhaustive analysis of naturally occurring language. Contributions from all theoretical frameworks are welcome but they should be addressed at a general audience and thus be explicit about their assumptions and discovery procedures and provide sufficient theoretical background to be accessible to researchers from different frameworks. Topics Corpus Linguistics Quantitative Linguistics Phonology Morphology Semantics Syntax Pragmatics.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信